GM, Agata Meeting, Padova, May 2003
A DAQ Architecture for the Agata experiment
Gaetano Maron
INFN – Laboratori Nazionali di Legnaro
Outline
• On-line Computing Requirements
• Event Builder
• Technologies for the On-line System
• Run Control and Slow Control
• Agata Demonstrator 2003-2007
• Off-line Infrastructure
• Conclusions
Agata Global On-line Computing Requirements

Data flow: front-end electronics and pre-processing → Pulse Shape Analysis → Event Builder → Tracking → Storage

- Front-end output: 1000 Gbps (4 Gbps x 200 detectors)
- Pulse Shape Analysis: 1.5 x 10^6 SI95 (?) (present algorithm); output max 10 Gbps (50 Mbps x 200)
- Event Builder: 5 x 10^3 SI95; output 10 Gbps
- Tracking: 3 x 10^5 SI95 (no GLT), 3 x 10^4 SI95 (GLT @ 30 kHz); output 1 Gbps

SI95 = SpecInt 95; 1 SI95 = 10 CERN Units = 40 MIPS
GLT = Global Level Trigger
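To put these SI95 budgets in scale, a back-of-the-envelope sketch of the CPU counts they imply, using the per-CPU ratings quoted later in the talk (80 SI95 now, 250 SI95 projected for 2007); the helper name is ours, not part of the slides:

```python
# Back-of-the-envelope sizing for the AGATA on-line farms.
# SI95 budgets come from this slide; per-CPU ratings are the
# "80 SI95 now" / "2007 CPU = 250 SI95" estimates quoted later.

def cpus_needed(required_si95, si95_per_cpu):
    """Number of CPUs needed to supply a given SI95 budget (rounded up)."""
    return -(-required_si95 // si95_per_cpu)  # ceiling division

PSA_SI95 = 1_500_000            # present PSA algorithm, full array
TRACKING_SI95_NO_GLT = 300_000  # tracking without a Global Level Trigger
TRACKING_SI95_GLT = 30_000      # tracking with a GLT at 30 kHz

print(cpus_needed(PSA_SI95, 80))               # 18750 CPUs with 2003 hardware
print(cpus_needed(PSA_SI95, 250))              # 6000 CPUs with 2007 hardware
print(cpus_needed(TRACKING_SI95_NO_GLT, 250))  # 1200
```

The PSA requirement dominates everything else by an order of magnitude, which is why the talk returns to it repeatedly.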
Pulse Shape Analysis Farm

Computing required: 1.5 MSI95 (now)

Per-detector chain (i = 1-200): Detector → PREAM → FADC → FPGA + 4 DSPs → Mux → PSA Farm (PCs)
- 1 Gbps x 200 out of the front-end
- 50 Mbps x 200 into the PSA farm
Event Building: simple case

[Figure: PSA farms R1-R200, driven by a common clock, emit time-stamped fragments (T01-T12) grouped into Time Slots 1-3; each time slot is routed over the Builder Network (10 Gbps) to one of the Builder Units BU1-BUn.]

Here n could range (at present) from 10 to 15, depending on the efficiency of the event-builder algorithm and on the communication protocol used.
In the final configuration (after 2005) we could imagine a single 10 Gbps output link and a single BU.
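The routing idea above can be sketched in a few lines. This is a simplified illustration, not the real AGATA algorithm: the rule that maps a time slot to a Builder Unit is assumed to be a round-robin, and the fragment layout is invented:

```python
# Sketch of time-slot based event building: each PSA farm emits
# (time_slot, detector_id, data) fragments; the time slot alone decides
# which Builder Unit (BU) receives a fragment, so all fragments of one
# slot land on the same BU and can be merged there.

from collections import defaultdict

N_BUILDER_UNITS = 3  # the slide quotes 10-15 for the real system

def builder_unit_for(time_slot, n_bu=N_BUILDER_UNITS):
    """Route a time slot to a BU round-robin; any fixed rule works as
    long as every source applies the same one."""
    return time_slot % n_bu

def build_events(fragments, n_bu=N_BUILDER_UNITS):
    """Group fragments per BU and, inside each BU, per time slot."""
    bus = [defaultdict(list) for _ in range(n_bu)]
    for time_slot, detector_id, data in fragments:
        bus[builder_unit_for(time_slot, n_bu)][time_slot].append((detector_id, data))
    return bus

fragments = [(1, "R1", b"a"), (1, "R2", b"b"), (2, "R1", b"c"), (4, "R2", b"d")]
bus = build_events(fragments)
# slots 1 and 4 both land on BU1 (1 % 3 == 4 % 3 == 1), slot 2 on BU2
```

Because the routing is a pure function of the time slot, no central coordination between the 200 sources is needed.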
Time Slot Assignment

For each detector slice i:
1. The TTC assigns an event number (time stamp) to each event.
2. The MUX buffers events according to a given rule, thereby defining the Time Slot, and assigns a buffer number to this collection of events.
3. The MUX distributes the buffers to the PSA farm according to their buffer number.
4. The PSA farm shrinks the incoming buffers, so a further buffering stage is needed; the PSA assigns an EB (Event Builder) number to the new buffers and distributes them to the Event Builder farm according to that number.
All of this is synchronous for all the detector slices (Slice 1 ... Slice 200); event merging in the EB Farm is then feasible.
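The numbering chain on this slide (event # → buffer # → EB #) can be illustrated as follows. The packing rule, the function names and the parameters are assumptions for the sketch; the slide leaves the actual "given rule" unspecified:

```python
# Illustrative sketch of the slide's numbering chain: the TTC stamps
# events, the MUX packs a fixed number of events into a numbered buffer
# (defining the time slot), and the PSA stage re-tags its reduced
# output with an Event Builder number.

EVENTS_PER_BUFFER = 4  # stands in for the unspecified "given rule"

def mux_pack(stamped_events):
    """Group consecutive time-stamped events into numbered buffers."""
    buffers = {}
    for ev_number, data in stamped_events:
        buf_number = ev_number // EVENTS_PER_BUFFER  # defines the time slot
        buffers.setdefault(buf_number, []).append((ev_number, data))
    return buffers

def psa_rebuffer(buf_number, psa_output, n_eb_nodes=3):
    """Tag a PSA-reduced buffer with an EB number so every slice sends
    the same time slot to the same EB node."""
    eb_number = buf_number % n_eb_nodes
    return eb_number, psa_output

events = [(n, f"pulse{n}") for n in range(8)]
buffers = mux_pack(events)  # buffers 0 and 1, four events each
```

The key property is that every slice derives the same EB number from the same buffer number, which is what makes the synchronous merge in the EB farm feasible.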
Agata Event Builder: some more requirements
[Figure: as on the previous slide, time-stamped fragments from readout farms R1-R200 (common clock) fill Time Slots 1-3 and travel over the Builder Network (10 Gbps) to Builder Units BU1-BU3.]

- Delayed coincidences can span more than one time slot.
- Fragments of the same event may therefore end up in different BUs.
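The cross-slot problem can be made concrete with a small sketch. The slot length, coincidence window and predicate name are invented for illustration; the slide only states the problem, not a solution:

```python
# Hedged sketch of the delayed-coincidence issue: a coincidence window
# can straddle a time-slot boundary, so BUs holding adjacent slots must
# exchange fragments that sit near the edges of their slots before
# merging. The constants are illustrative, not AGATA values.

SLOT_LENGTH = 1000        # time-stamp ticks per slot (assumed)
COINCIDENCE_WINDOW = 50   # max delayed-coincidence span (assumed)

def needs_neighbor_exchange(timestamp):
    """True if an event is close enough to a slot edge that its delayed
    partner may live in the adjacent slot (hence in another BU)."""
    offset = timestamp % SLOT_LENGTH
    return offset < COINCIDENCE_WINDOW or offset >= SLOT_LENGTH - COINCIDENCE_WINDOW

print(needs_neighbor_exchange(30))    # True: near the start of its slot
print(needs_neighbor_exchange(500))   # False: safely inside the slot
print(needs_neighbor_exchange(1980))  # True: near the end of its slot
```

Only the small fraction of events near slot boundaries needs inter-BU traffic, so the extra load on the builder network stays modest.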
HPCC for Event Builder

The Builder Network is a High Performance Computing and Communication (HPCC) system:
- high-speed links (> 2 Gbps)
- low-latency switch
- fast inter-processor communication
- low-latency message passing
Agata on-line system
[Figure: Front-End nodes F1-F200 feed the PSA Farm R1-R200 at 1000 Gbps; the PSA farm feeds the Event Builder B1-B20 through the HPCC Builder Network at 10 Gbps; the Event Builder output (10 Gbps) goes to the Tracking Farm, whose output (1 Gbps) reaches the Data Servers ds1-ds4 and the Storage (1000 TB); control links run at 100 Mbps to > 1 Gbps.]
Technologies for Agata On-line System
• Networking Trends
• Event Builder
• CPU Trends
• Building blocks for the Agata Farms
• Storage Systems
Networking Trends - I

[Chart: aggregate bandwidth within a single switch, 2000-2003: 64, 128, 192, 256 Gbps (GigaEthernet, Myrinet, Infiniband, 10 GbEth); Myrinet one-way latency ~10 µs, Myrinet throughput ~250 MB/s.]

• Local networking is not an issue: Ethernet already meets the future needs of the NP experiments
– link speed up to 1 Gbps
– switch aggregate bandwidth O(100) Gbps
– O(100) Gbit ports per switch
– O(1000) FastEthernet ports per switch
• If HPCC is required (e.g. for the Agata builder farm), the options are Myrinet and Infiniband (= 4 x Myrinet)
Networking Trends II: Infiniband

[Figure: an Infiniband fabric: CPUs with memory controllers attach through Host Channel Adapters (HCA) to switches and routers; Target Channel Adapters (TCA) attach storage and network targets; the fabric reaches the Internet/Intranet. Links come in 1x, 4x and 12x widths.]

• The same network transports low-latency IPC, storage I/O and network I/O
• Channel-based message passing
• 1000s of nodes per subnet
• Link speeds: 1x = 2.5 Gbps, 4x = 10 Gbps, 12x = 30 Gbps
• New server form factor (about 222 mm x 110 mm): roughly 300-400 boxes per rack
Event Builder and Switch Technologies
CMS EVB Demonstrator 32x32
CMS
Myrinet EVB (with Barrel Shifter)
CMS
Raw GbEth EVB
CMS
GbEth full Standard TCP/IP
CMS
CPU Load 100 %
TCP/IP CPU Off-loading - iSCSI

• Internet SCSI (iSCSI) is a standard protocol for encapsulating SCSI commands into TCP/IP packets, enabling block I/O data transport over IP networks
• An iSCSI adapter combines NIC and HBA functions:
1. it takes the data in block form
2. it handles the segmentation and processing with a TCP/IP processing engine
3. it sends IP packets across the IP network

[Figure: application/driver/link-layer stacks compared for a plain Network Interface Card, an iSCSI adapter (e.g. the Intel GE 1000 T IP Storage Adapter with its i80200 processor) and a Fibre Channel storage HBA: file → block → IP packets on Ethernet, versus FC packets.]
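What the three steps mean can be shown with a toy example: wrap a disk block in a header and cut the result into MTU-sized TCP payloads. The header layout here is invented for illustration; the real iSCSI PDU format is defined in RFC 3720:

```python
# Toy illustration of the adapter's three steps: take a disk block,
# prepend a (simplified, made-up) iSCSI-like header, and segment the
# result into MTU-sized chunks as the TCP engine would.

import struct

MTU_PAYLOAD = 1460  # typical TCP payload per Ethernet frame

def encapsulate_block(lba, block):
    """Steps 1-2: prepend a minimal header (opcode, LBA, length)."""
    header = struct.pack(">BQI", 0x01, lba, len(block))  # 13-byte toy header
    return header + block

def segment(payload, mtu=MTU_PAYLOAD):
    """Steps 2-3: split the PDU into MTU-sized segments for the network."""
    return [payload[i:i + mtu] for i in range(0, len(payload), mtu)]

pdu = encapsulate_block(lba=42, block=b"\x00" * 4096)  # one 4 KiB block
segments = segment(pdu)
print(len(pdu), len(segments))  # 4109 bytes -> 3 segments
```

The point of the dedicated adapter is that this segmentation and checksumming happens on the card's own processor instead of loading the host CPU.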
Comments on the Agata Event Builder

• The Agata Event Builder is not an issue (even today): the CMS experiment has already shown the ability to work at rates an order of magnitude beyond the Agata requirements
• Agata could work, even in the prototype, fully on standard TCP/IP
• Agata could require an HPCC-based Event Builder. The technologies already exist, though they have never been applied to event-builder problems; this should not be a big issue
– Myrinet (now)
– Infiniband (soon)
Processors and Storage trends

• CPU: 80 SI95 now; 250 SI95 per CPU in 2007; 700 SI95 per CPU in 2010
• Disk: 250 GB now; 1 TByte per disk by 2007
Building Blocks for the Agata Farms

1 U CPU box with 2 processors; 40 boxes per rack

Year | SI95 per box
-----|-------------
2004 | 200
2007 | 500
2010 | 1500

Configurations:

Farm type            | 2004 (1 det.)  | 2007 (15 det.)  | 2010 (200 det.)
                     | Boxes | Racks  | Boxes | Racks   | Boxes | Racks
PSA Farm             | 35    | 1      | 200   | 5       | 1000  | 25
Builder Farm         | -     | -      | 2     | 1/20    | 10    | 1/4
Track. Farm (no GLT) | -     | -      | 40    | 1       | 200   | 5
Track. Farm (GLT)    | -     | -      | 4     | 1/10    | 20    | 1/2
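The table follows from a simple rule, sketched below under the assumption that boxes = required SI95 / SI95-per-box and racks = boxes / 40 (the formula is implied by the slide, not stated):

```python
# Cross-check of the farm-sizing table: the 2010 PSA row should
# reproduce 1000 boxes / 25 racks from the 1.5 MSI95 requirement.

import math

SI95_PER_BOX = {2004: 200, 2007: 500, 2010: 1500}
BOXES_PER_RACK = 40
PSA_SI95_FULL = 1_500_000  # full-array PSA requirement (present algorithm)

def boxes_needed(si95_required, year):
    """Boxes required to supply a given SI95 budget in a given year."""
    return math.ceil(si95_required / SI95_PER_BOX[year])

boxes = boxes_needed(PSA_SI95_FULL, 2010)
print(boxes, boxes / BOXES_PER_RACK)  # 1000 25.0
```

The same rule applied to the tracking requirement (3 x 10^5 SI95 in 2010) gives the table's 200 boxes / 5 racks.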
Blade Based Farms

1 blade box with 2 processors; 14 boxes per crate (7 U); 6 blade crates per rack = 108 boxes; power = 16 kW per rack

[Figure: each crate carries two switches (SW1, SW2) on a 30 Gbps backplane, with 2 x 4 x 1 Gbps uplinks.]

Configurations:

Farm type            | 2004 (1 det.)   | 2007 (15 det.)    | 2010 (200 det.)
                     | Blades | Racks  | Blades | Racks    | Blades | Racks
PSA Farm             | 35     | 1/3    | 200    | 2        | 1000   | 10
Builder Farm         | -      | -      | 2      | < crate  | 10     | < crate
Track. Farm (no GLT) | -      | -      | 40     | 2/5      | 200    | 2
Track. Farm (GLT)    | -      | -      | 4      | < crate  | 20     | 1/5
On-line Storage

• On-line storage needs
– 1-2 week experiments
– max 100 TByte per experiment (no GLT)
– max 1000 TByte per year
– in 2010, 1 disk = 4 TByte → the Agata storage system needs 250 disks (+ 250 for mirroring)
• Archiving
– O(1000) TB per year cannot be handled as normal flat files
– not only physics data are stored: run conditions, calibration
– correlations between physics data, calibration and run conditions are important for the off-line analysis
– database technology already plays an important role in physics data archiving (BaBar, the LHC experiments, etc.); Agata can exploit their experience and developments
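The disk count above is a direct computation; a quick check of the arithmetic:

```python
# Quick check of the on-line storage sizing: 1000 TB/year on the
# projected 2010 disk size of 4 TB, with full mirroring.

import math

YEARLY_TB = 1000
DISK_TB = 4  # projected 2010 disk size

data_disks = math.ceil(YEARLY_TB / DISK_TB)
print(data_disks, data_disks * 2)  # 250 data disks, 500 with mirroring
```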
Storage Technologies Trends

[Figure: application servers and data servers attach through a gateway to a SAN-enabled disk array over GEth/iSCSI or Infiniband.]

A commodity Storage Area Network is shared by all the farm nodes. The technologies of interest for us are:
- iSCSI over Gigabit (or 10 Gigabit) Ethernet
- Infiniband

Full integration between the SAN and the farm is achieved if a cluster file system is used. Examples of cluster file systems:
- LUSTRE (www.lustre.org)
- STORAGE TANK (IBM)
Example of an iSCSI SAN available today

[Figure: application servers reach the data servers over GEth/iSCSI; each data server drives, through 2 x GE links, an LSI Logic iMegaRAID controller (iSCSI front-end, RAID-SATA back-end) with 16 SATA disks, about 5 TByte per controller.]

Host adapters:
- Intel GE 1000 T
- Adaptec ASA-7211
- LSI 5201
- etc.

SATA = Serial ATA
Data Archiving

[Figure: an input load-balancing switch feeds a set of data servers with shared data caching (Oracle); the servers are joined by a low-latency interconnect (e.g. HPCC) and attached to a Storage Area Network; scalability comes from adding servers.]
Run Control and Slow Control

[Figure: the Run Control and Monitor System and the Slow Control span the whole chain: front-end electronics and pre-processing, Pulse Shape Analysis, Event Builder, Tracking, Storage.]
Run Control and Slow Control Technological Trends
• Virtual Counting Room
• Web based technologies– SOAP– Web Services and Grid Services (Open Grid Service Architecture)– Data Base
• Demonstrators in operation at the CMS test-beam facilities
RCMS present demonstrators

[Figure: Java Tomcat containers hosting Web Services or Grid Services, exchanging SOAP messages with the controlled resources, backed by a MySQL database.]
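To make the SOAP-based control concrete, a minimal sketch of what such a message looks like. The command name (`startRun`) and its parameters are invented for illustration; the real RCMS demonstrators are Java/Tomcat web services, not this Python snippet:

```python
# Minimal sketch of a SOAP-style run-control message of the kind the
# RCMS demonstrators exchange. Command name and fields are hypothetical.

from xml.etree import ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def make_command(name, **params):
    """Wrap a run-control command and its parameters in a SOAP envelope."""
    env = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(env, f"{{{SOAP_NS}}}Body")
    cmd = ET.SubElement(body, name)
    for key, value in params.items():
        ET.SubElement(cmd, key).text = str(value)
    return ET.tostring(env, encoding="unicode")

msg = make_command("startRun", runNumber=1234, trigger="GLT")
# msg is an XML string: <Envelope><Body><startRun><runNumber>1234...
```

Because the payload is plain XML over HTTP, any language on any node of the farm can send or receive it, which is the point of choosing web-service technologies for run control.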
Slow Control Trends

• Ethernet everywhere
• Agata could be fully controlled through Ethernet connections, including the front-end electronics
• This leads to a homogeneous network, avoiding bridges between buses and the software drivers needed to perform the bridging
• Embedded web servers and embedded Java virtual machines on the electronics (e.g. the TINI system, Xilinx Virtex-II Pro)
• Embedded Java should guarantee a homogeneous development environment, portability, etc.
Agata Demonstrator (2003-2007)

[Figure: Front-End nodes F1-F15 feed PSA farm nodes P1-P15 (blade centers behind a 15 x 2 Eth switch, 1 Gbps links); the Event Builder B1-B2 (2 dual-processor servers + Myrinet as HPCC builder network) feeds the Tracking Farm T1-T2 and the Data Servers; the Storage Area Network is iSCSI, with SAN iSCSI disk arrays of SATA disks (2 x 16 disks, 8 + 8 TByte); control links at 100 Mbps.]
Off-line Infrastructure

Data Production Center:
- on-line system
- on-line storage (1000 TByte)
- central archive ?

Regional Computing Facilities:
- computing power for their own analyses
- on-line storage for 1-2 experiments
- local archive

>= 10 Gbps links

• Exploit the LHC off-line infrastructure based on Regional Computing centers
• Regional Computing facilities (e.g. one per country) are linked to the data center via 10 Gbps links
• All the computing facilities are based on PC farms
• A typical experiment takes about 1 day to copy its entire data set
• No tape copy
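The "about 1 day" claim follows from the numbers already given (max 100 TByte per experiment, 10 Gbps links); a quick check:

```python
# Sanity check on "about 1 day to copy the entire data set": moving
# the max 100 TB of one experiment over a 10 Gbps link.

DATASET_TB = 100
LINK_GBPS = 10

bits = DATASET_TB * 1e12 * 8        # dataset size in bits
seconds = bits / (LINK_GBPS * 1e9)  # ideal transfer time at line rate
print(seconds / 3600)  # ~22.2 hours, i.e. roughly one day
```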
World Wide Farming: the GRID

• The GRID is an emerging model of large distributed computing. Its main aims are:
– transparent access to multi-petabyte distributed databases
– easy plug-in
– hiding the complexity of the infrastructure
• GRID tools can significantly help the experiments' computing by promoting integration between the data centers and the regional computing facilities
• GRID technology fits the off-line requirements of the Agata experiment

HEP-focused GRID initiatives:
- DataGrid (EU)
- GriPhyN (USA)
- PPDG (USA)
Summary

• Full adoption of digital electronics
– increased on-line computing power needed to perform pulse shape analysis
– Digital Signal Processors (DSPs) embedded in the front-end electronics
– commodity components like PCs and networking devices
• Trigger-less system (dead-time free)
– time-stamp techniques are used to obtain a dead-time-free system
– on-line computing power is needed to correlate data by applying prompt or delayed coincidences (event building)
• On-line analysis
– the tracking system needs O(10^5) SI95
• Storage
– O(100) MB/s onto the storage devices
– use of databases for archiving; an advanced parallel server is needed to sustain the rate
• Off-line analysis using GRID techniques
– data storage O(1000) TB per year
– international collaborations: data distribution, regional computing centers
– GRID architecture
Conclusions

• No fundamental technological issues for the final Agata on-line system:
– The experiment requirements and the present understanding of the PSA algorithm imply a final (2010) on-line system of moderate size (O(1000) machines). Just a 3x improvement in the PSA calculations would lead to a much more manageable system (3 racks).
– Both the network and the event-builder requirements are already met by today's available technologies.
– The storage requirements (1000 TByte) fit the evolution of storage technologies.
– On-line storage staging, high-bandwidth network data transfer and GRID technologies allow data distribution over the WAN; tape is used only for backup.
• Demonstrator
– Same architecture as the final system, only scaled down to the foreseen number of detectors.