Implementation of the STAR Data Acquisition System using a Myrinet Network J.M. Landgraf, M.J. LeVine, A. Ljubicic, Jr., M.W. Schulz (Brookhaven National Laboratory) J.M. Nelson (University of Birmingham) C. Adler, J.S. Lange (University of Frankfurt)


Page 1: Implementation of the STAR Data Acquisition System using a Myrinet Network

Implementation of the STAR Data Acquisition System using a Myrinet Network

J.M. Landgraf, M.J. LeVine, A. Ljubicic, Jr., M.W. Schulz (Brookhaven National Laboratory)
J.M. Nelson (University of Birmingham)
C. Adler, J.S. Lange (University of Frankfurt)

Page 2

First Collisions at RHIC!

STAR Control Room, June 12th, 2000, 9:00pm

Page 3

Outline

• The STAR DAQ System
  – Components
  – Event Building Network
• Introduction to Myrinet
• Myrinet Implementation
  – Myrinet Software (GM)
  – STAR DAQ Software
  – myriLib
• Year 2 Event Builder
• Performance & Reliability

Page 4

STAR DAQ

• DAQ Readout Units
  – VME Crate-Based
  – Custom RBs with ASICs & i960 CPUs
  – Motorola MVME Detector Broker
• L3
  – Linux Farm (Compaq Alpha workstations)
  – Physics-based build decision
• Event Building Network
  – Token Management
  – Event Building
  – Event Storage and Buffering

Page 5

DAQ / L3 Event Building Network

Squares: MVME / VxWorks
Circles: Alpha Workstations / Linux
Diamonds: Ultrasparc Workstations / Solaris

Page 6

What is Myrinet?

• Commercial Network From Myricom (www.myri.com)
• Low cost (~$1K / Card, $4-6K / Switch)
• PCI / PMC Network Interface Cards
• High bandwidth (1.28 + 1.28 Gb/sec)
• Low Latency (13 usec)
• Scalable switched topology
• Network control performed in software
• Open-source MCP / Driver software
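The bandwidth and latency figures above can be combined into a rough estimate of one-way transfer time. The sketch below assumes a simple "fixed latency plus serialization time" model, which the slides do not state; it ignores host, PCI, and software overhead, and the function names are made up for illustration.

```c
#include <assert.h>

/* Figures quoted on the slide. */
#define LINK_GBIT_PER_SEC 1.28   /* per direction */
#define LATENCY_USEC      13.0

/* Link bandwidth in bytes per microsecond (1.28 Gb/s = 160 MB/s = 160 B/us). */
static double link_bytes_per_usec(void)
{
    return LINK_GBIT_PER_SEC * 1e9 / 8.0 / 1e6;
}

/* Idealized one-way transfer time: fixed latency plus serialization time.
 * This model is an assumption, not a measurement from the slides. */
double transfer_time_usec(double msg_bytes)
{
    assert(msg_bytes >= 0.0);
    return LATENCY_USEC + msg_bytes / link_bytes_per_usec();
}

/* Effective throughput in MB/s for a given message size.
 * Bytes per microsecond is numerically equal to MB/s. */
double effective_mb_per_sec(double msg_bytes)
{
    return msg_bytes / transfer_time_usec(msg_bytes);
}
```

Under this model a 120-byte message takes about 13.75 usec and is dominated by latency (roughly 8.7 MB/sec effective), while a 1 MB transfer approaches the 160 MB/sec link rate.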

Page 7

Myrinet Architecture

• Network Card Interface (PCI64B)
  – Lanai processor controls network
  – Local memory buffer
  – Both network & PCI DMA engines
• Switches
  – Cut-through wormhole routing
  – CRC is recalculated at each stage, including header
  – Stop/Go flow control mediated with small slack buffer
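As an illustration of Stop/Go flow control with a slack buffer, here is a minimal sketch using two fill-level thresholds. The capacity, the thresholds, and all names are hypothetical; the slides give no numbers, and this is not Myricom's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes; the slide gives no numbers. */
#define SLACK_CAPACITY 32
#define STOP_THRESHOLD 24   /* signal STOP once fill rises above this */
#define GO_THRESHOLD    8   /* signal GO once fill drops below this */

typedef enum { FLOW_GO, FLOW_STOP } flow_state;

typedef struct {
    size_t     fill;    /* bytes currently held in the slack buffer */
    flow_state state;   /* last control symbol sent upstream */
} slack_buffer;

void slack_init(slack_buffer *sb)
{
    sb->fill = 0;
    sb->state = FLOW_GO;
}

/* One byte arrives from upstream. Returns 0 on overflow, which should not
 * occur if the sender honors STOP before the remaining headroom is used. */
int slack_push(slack_buffer *sb)
{
    if (sb->fill == SLACK_CAPACITY) return 0;
    sb->fill++;
    if (sb->fill > STOP_THRESHOLD) sb->state = FLOW_STOP;
    return 1;
}

/* One byte is forwarded downstream. Returns 0 if the buffer is empty. */
int slack_pop(slack_buffer *sb)
{
    if (sb->fill == 0) return 0;
    sb->fill--;
    if (sb->fill < GO_THRESHOLD) sb->state = FLOW_GO;
    return 1;
}
```

The gap between the two thresholds gives hysteresis, so the control state does not toggle on every byte; the space above the STOP threshold absorbs data already in flight when STOP is issued.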

Page 8

Myrinet Throughput

We Tested: 32 / 64 bit Myrinet cards
• VxWorks: MVME 2604, MVME 2306
• Linux: Compaq Alpha
• Linux: Intel
• Solaris: Ultrasparc

Page 9

Myrinet Software

Network mapping
• Each Myrinet node maintains a list of port offsets to each other node
• Dynamic and static mapping supported
• Alternate routes can be forced by user

Myrinet driver (GM)
• Variable length messages
  – Sender / Receiver provide buffers in advance for each size
  – Sender / Receiver notified and must return buffer to gm
• Directed Sends
  – DMA directly to host memory
  – Receiver not notified
• GM imposes structure on user program
  – Poll / Block on gm_receive()
  – GM is not thread-safe
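The "buffers provided in advance for each size" discipline can be sketched as a pool with one free count per size class. This is a self-contained illustration and does not call the GM API: the power-of-two classes, the class limits, and all function names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: class c holds messages of up to 2^(c + MIN_SHIFT) bytes. */
#define NUM_CLASSES    8
#define MIN_SHIFT      5
#define BUFS_PER_CLASS 4

typedef struct {
    int free_count[NUM_CLASSES];  /* buffers currently owned by the pool */
} buffer_pool;

void pool_init(buffer_pool *p)
{
    for (int c = 0; c < NUM_CLASSES; c++) p->free_count[c] = BUFS_PER_CLASS;
}

/* Smallest class whose buffers fit len bytes, or -1 if len is too large. */
int size_class_for(size_t len)
{
    for (int c = 0; c < NUM_CLASSES; c++)
        if (len <= ((size_t)1 << (c + MIN_SHIFT))) return c;
    return -1;
}

/* Hand a buffer to the network layer. Returns the class used, or -1 if no
 * buffer of that class was provided in advance. */
int pool_take(buffer_pool *p, size_t len)
{
    int c = size_class_for(len);
    if (c < 0 || p->free_count[c] == 0) return -1;
    p->free_count[c]--;
    return c;
}

/* Called when the completion notification arrives: the buffer is returned. */
void pool_return(buffer_pool *p, int c)
{
    assert(c >= 0 && c < NUM_CLASSES);
    assert(p->free_count[c] < BUFS_PER_CLASS);
    p->free_count[c]++;
}
```

The sketch tracks only counts, not memory, to show why every notification must be answered by returning the buffer: once a class is exhausted, further messages of that size cannot be accepted.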

Page 10

DAQ Software

Software is Message Based

    for (;;) {
        msgQReceive(&msg);
        switch (msg.cmd) {
            /* one case per command */
        }
    }

Sending is routed to the proper network
Each network has an associated daemon

    daqMsgSend(node, &msg)

Routing by node/task/domain:
  Local Queue    que[task]
  Myrinet        myriLib
  Ethernet       ethComLib
  VME            vmComLib

ICCP Message Protocol
• 120 byte messages
• Standard header
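The routing step behind daqMsgSend() might look roughly like the sketch below. Only the 120-byte message size and the four transports come from the slide; the header fields, the routing table, and the names iccp_header, route_entry, and select_route are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ICCP_MSG_BYTES 120   /* message size from the slide */

/* Hypothetical header layout; the slide only says "standard header". */
typedef struct {
    uint16_t src_node;
    uint16_t dst_node;
    uint8_t  dst_task;
    uint8_t  cmd;
    uint16_t payload_len;
} iccp_header;

/* Fixed-size message: header plus payload filling the remaining bytes. */
typedef struct {
    iccp_header head;
    uint8_t     payload[ICCP_MSG_BYTES - sizeof(iccp_header)];
} iccp_msg;

typedef enum {
    ROUTE_NONE = -1,   /* destination unknown */
    ROUTE_LOCAL,       /* local queue, que[task] */
    ROUTE_MYRINET,     /* myriLib */
    ROUTE_ETHERNET,    /* ethComLib */
    ROUTE_VME          /* vmComLib */
} daq_route;

/* Hypothetical routing table entry: which transport reaches a node. */
typedef struct {
    uint16_t  node;
    daq_route via;
} route_entry;

/* Pick the transport for a destination node. */
daq_route select_route(uint16_t self, uint16_t dst,
                       const route_entry *table, int n)
{
    if (dst == self) return ROUTE_LOCAL;
    for (int i = 0; i < n; i++)
        if (table[i].node == dst) return table[i].via;
    return ROUTE_NONE;
}
```

A fixed message size keeps each transport's daemon simple, since every queue entry and network buffer has the same length; the receiving task then dispatches on the command field as in the loop shown on the slide.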

Page 11

myriLib

DAQ library which wraps gm
• myriMsgSend()
• myriMemCpy()

What does it do?
• Manages the DMA message buffers
• Handles callback functions
• Thread synchronization
• Misc… (Byte order, 32 vs. 64 bit etc.)
• Bypasses DMA limitations on Solaris

Several Flavors
• Threaded vs Process
• Buffered vs Unbuffered DMA copies
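The byte-order item matters because the network mixes big-endian hosts (PowerPC-based MVME boards, Ultrasparc) with little-endian ones (Alpha, Intel). How myriLib handles this is not described on the slides; the sketch below is one common approach, with hypothetical function names, that swaps 32-bit words only when sender and receiver disagree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit word. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}

/* Convert n 32-bit words in place if the sender's byte order differs from
 * the host's. The sender's order is assumed to be known to the caller, for
 * example from a flag in the message header. */
void fix_byte_order(uint32_t *words, size_t n, int sender_little_endian)
{
    if (!!sender_little_endian == host_is_little_endian()) return;
    for (size_t i = 0; i < n; i++) words[i] = swap32(words[i]);
}
```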

Page 12

myriLib Operations

Threaded (VxWorks tasks) myriLib

These lead to extra latency / reduced throughput for directed sends

Process myriLib with Buffering

Page 13

myriMemCopy() Throughput

Page 14

Multi-Sender myriLib Throughput

32-bit card MVME 2306 senders
64-bit Ultrasparc receiver

Page 15

Year 2 Event Building

Solaris Myrinet Cards allow us to implement the EVB on the BB Node
– Removes a node from the network
    Simplifies Software
    Replaces point-to-point transfer with many-to-point transfer
– More Memory (1.5 GB vs. 256 MB)
    Simplifies Software
    Throughput increase via multiple pftp streams (30-35 MB/Sec vs. 25 MB/Sec)
– Multi-CPU Ultrasparc Machine
    Compression on Built Events?

Preliminary Results Show
– Improved Small Event Performance (25 evts/Sec → 140 evts/Sec)
– Improved Throughput to BB (28 MB/Sec → 100 MB/Sec)

Page 16

Year 1 Performance & Reliability

RHIC Data Run
• 3 Months Data Taking
• ~15 Days Integrated Stable Beam
• Little down time due to DAQ

STAR Performance
• ~10 TB data
• ~2.03 Million Events

Myrinet Performance
• 4 known message failures (out of >10^8)
  – Cause not known
  – Reported by software
  – Resulted in aborted run
• No known data corruption

Page 17

Au-Au Central Collision

130 GeV Au-Au Collision viewed through the L3 Event Display
• Several thousand tracks
• Tracking in real time (~100 msec)