Trigger Primitive Generation Chain
Status for June DAQ Sprint
Kunal Kothekar,
04/06/2019
Jun 4, 2019 2Kunal Kothekar, University of Bristol
SP DAQ Front-end firmware: Scope
● The front-end system is expected to perform data compression and trigger primitive generation.
● Acts as a co-processor of evolving FELIX hardware with a daughter-card mounted FPGA.
● Also expected to perform data buffering for supernova trigger.
Plan for June DAQ
● Test a preliminary prototype of the front-end DAQ chain at protoDUNE from June 11th (for two weeks).
● We will use the ipbus framework to test data from a single fiber via the WIB and a ZCU102 evaluation board.
● The outline: get protoDUNE data onto ipbus, unpack and reorder it, send it to the main Trigger Primitive Generation (TPG) block, then send the result to an output buffer for offline processing.
● A validation framework in simulation is also in place; it will verify the firmware output at every stage against its software counterpart.
● The objectives for this test are:
1. Getting hardware-ready processing blocks.
2. Getting to know the test area at protoDUNE for the future.
3. Testing actual data on the front-end DAQ chain.
4. Testing data protocols for input and output.
5. Debugging, producing resource utilization reports and chalking out a future plan.
Specifications for June DAQ
TPG Chain: Processing block
Data formats for TPG chain: input
● AXI4-based streaming interface after unpacking and reordering of the data.
● Each ADC sample is 16 bits long.
● A packet consists of 64 such ADC samples (data frame) and a header with a time stamp, channel number and other flags (header frame).
● A ‘tuser’ flag which goes high at the end of every frame (data, header).
● A ‘tlast’ flag which goes high at the end of the packet (header + data).
● A ‘tready’ flag to handle back-pressure.
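The input framing can be sketched in software as a list of stream beats. This is only an illustration under assumptions: the number of header words is not given on the slide (two are used here as a placeholder), the names `Beat` and `packet_to_beats` are hypothetical, and tready back-pressure is not modelled.

```python
from typing import Iterator, List, NamedTuple

SAMPLES_PER_PACKET = 64  # from the slide: 64 ADC samples per data frame


class Beat(NamedTuple):
    tdata: int   # 16-bit payload word
    tuser: bool  # high on the last word of a frame (header or data)
    tlast: bool  # high on the last word of the packet (header + data)


def packet_to_beats(header: List[int], samples: List[int]) -> Iterator[Beat]:
    """Yield one beat per 16-bit word: header frame first, then data frame."""
    if not header:
        raise ValueError("a packet needs at least one header word")
    if len(samples) != SAMPLES_PER_PACKET:
        raise ValueError("a data frame holds exactly 64 ADC samples")
    for i, word in enumerate(header):
        # End of the header frame raises tuser but not tlast.
        yield Beat(word & 0xFFFF, tuser=(i == len(header) - 1), tlast=False)
    for i, word in enumerate(samples):
        end_of_data = i == len(samples) - 1
        # End of the data frame is also the end of the packet.
        yield Beat(word & 0xFFFF, tuser=end_of_data, tlast=end_of_data)


# Placeholder header of two words (assumption) and a ramp of 64 samples.
beats = list(packet_to_beats([0x1234, 0x0007], list(range(64))))
```

In this sketch tuser is raised twice per packet (end of header frame, end of data frame) and tlast once.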
Data formats for TPG chain: output
● The output to the arbitrator (dtpc arb / FIFO) will have the format:
● Case I (one hit in a packet) → Header frame + Word 0 + Word 1 + Word 2 + Word 3 + Word 4 + Word 5
● Case II (more than one hit in a packet) → Header frame + Word 0 + Word 1 + Word 2 + Word 3 + Word 4 + Word 5 + Word 0 + Word 1 + Word 2 + Word 3 + Word 4 + Word 5 + …
● Word 0 (Hit start), Word 1 (Hit end), Word 2 (Hit peak), Word 3 (Peak time), Word 4 (Hit sum), Word 5 (Hit continue + other flags). For more info: Output_interface.pdf
● A hit-continue bit indicates whether or not a hit spilled over the data frame of 64 words: it is 0 if the hit is self-contained in the packet and 1 if the hit has spilled over.
● tuser and tlast indicate the ends of frames and packets respectively.
● Back-pressure handling with tready.
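As a rough software model of the output layout, the sketch below appends six words per hit after the header frame. The position of the hit-continue bit inside Word 5 (bit 0 here), the 16-bit truncation of every word and the names `Hit`, `hit_words` and `output_packet` are assumptions made for illustration; Output_interface.pdf remains the reference.

```python
from dataclasses import dataclass
from typing import List

HIT_CONTINUE_BIT = 0  # assumption: bit position of "hit continue" in Word 5


@dataclass
class Hit:
    start: int
    end: int
    peak: int
    peak_time: int
    adc_sum: int
    continues: bool  # True if the hit spilled over the 64-sample data frame


def hit_words(hit: Hit) -> List[int]:
    """Return Word 0 .. Word 5 for one hit, each truncated to 16 bits."""
    flags = int(hit.continues) << HIT_CONTINUE_BIT
    fields = (hit.start, hit.end, hit.peak, hit.peak_time, hit.adc_sum, flags)
    return [value & 0xFFFF for value in fields]


def output_packet(header: List[int], hits: List[Hit]) -> List[int]:
    """Header frame followed by six words per hit (Case I or Case II)."""
    words = list(header)
    for hit in hits:
        words.extend(hit_words(hit))
    return words


one_hit = output_packet([0xAAAA], [Hit(15, 25, 567, 22, 5633, False)])
two_hits = output_packet(
    [0xAAAA],
    [Hit(3, 9, 120, 5, 600, False), Hit(60, 63, 80, 62, 250, True)],
)
```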
Project Status
DGC / JB: WIB board to ZCU102 linking test ongoing.
Fransesco: WIB reading ongoing.
DGC: Block buffer almost ready.
Expected completion: ?
Project Status
Kunal & Kostas: Almost ready; all firmware is written, debugging and tests are ongoing.
Expected completion: Friday or Monday
Project Status
DGC & DN: Key decision pending about back-pressure and FIFO reading. Example firmware is written.
Expected completion: ?
Project Status
A Thea / K Harder : IP Bus framework ready.
Summary
● Gearing up for the June DAQ test.
● Compression and buffer management blocks are also ready and will most probably be included in the July DAQ test.
● Please keep in touch at:
● DUNE - Upstream FPGA hell (dune-uk.slack.com/#fpgahell)
● https://gitlab.cern.ch/DUNE-SP-TDR-DAQ
● https://wiki.dunescience.org/wiki/SP_FW_Development
Back-up
Firmware project status
● The blocks are developed towards implementation with ZCU102 eval board.
● An independent parallel software simulation chain is in development to verify the output of the final design at Bristol.
● A test strategy for various blocks and for completed design is in place.
Block Name          Developer       Status
                                    Simulation   Integration   Hardware Implementation
Hit Finding         Bristol         ✔            ✔             ✘
Pedestal / Filter   RAL             ✔            ✔             ✘
Compression         Imperial        ✔            ✘             ✘
Buffering           UCL             ✔            ✘             ✘
Prototype Wrapper   RAL / Bristol   ✔            ✘             ✘
Filter/Pedestal subtraction
● A chain of filter + pedestal subtraction has been developed and tested in simulation at RAL.
● FIR filter:
$$ y[n] = \sum_{i=0}^{N} b_i \, x[n-i] $$
where y[n] is the output signal, x[n] is the input signal, N = 16 is the filter order and b_i is the i-th coefficient of the filter.
● Pedestal subtraction:
→ If ADC value > pedestal: accumulator += 1; if ADC value < pedestal: accumulator -= 1.
→ If accumulator == +10: pedestal += 1; if accumulator == -10: pedestal -= 1.
● Complete firmware simulated; a detailed report on status to follow.
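A minimal software sketch of the two steps on this slide, assuming the standard FIR form y[n] = Σ b_i · x[n−i] with samples before the start of the stream treated as zero. Two details are assumptions not stated on the slide: the accumulator is reset to zero after each pedestal update, and the subtracted sample is computed with the already-updated pedestal. The names `fir_filter` and `PedestalTracker` are hypothetical.

```python
from typing import List, Sequence


def fir_filter(x: Sequence[int], b: Sequence[int]) -> List[int]:
    """Direct-form FIR: y[n] = sum_i b[i] * x[n - i], with x[k] = 0 for k < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for i, coeff in enumerate(b):
            if n - i >= 0:
                acc += coeff * x[n - i]
        y.append(acc)
    return y


class PedestalTracker:
    """Slow pedestal follower driven by an up/down accumulator."""

    def __init__(self, pedestal: int, limit: int = 10) -> None:
        self.pedestal = pedestal
        self.accumulator = 0
        self.limit = limit  # the slide uses +/-10

    def update(self, adc: int) -> int:
        """Track the pedestal and return the pedestal-subtracted sample."""
        if adc > self.pedestal:
            self.accumulator += 1
        elif adc < self.pedestal:
            self.accumulator -= 1
        if self.accumulator == self.limit:
            self.pedestal += 1
            self.accumulator = 0  # assumption: reset after an update
        elif self.accumulator == -self.limit:
            self.pedestal -= 1
            self.accumulator = 0  # assumption: reset after an update
        return adc - self.pedestal


tracker = PedestalTracker(pedestal=500)
subtracted = [tracker.update(505) for _ in range(10)]
```

With ten consecutive samples above the pedestal, the tracker moves the pedestal up by exactly one count, which shows how slowly the estimate follows the baseline compared with a real pulse.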
Hit finding
● Start from pedestal-subtracted ADC samples, s_i.
● A hit is recorded when the ADC value goes above a pre-defined threshold and lasts until it drops below the threshold.
● Complete firmware simulated; a detailed report on status to follow.
Quantities to be stored in the trigger primitive?
- t0: start time
- tf: end time
- peak time
- peak amplitude
- sum of s_i over the hit period
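The threshold rule and the stored quantities can be illustrated with a short software model. This is a sketch, not the firmware: the end time is taken as the last sample above threshold, a hit still open at the end of the block is simply closed there, and the function name `find_hits` is hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def find_hits(samples: Sequence[int], threshold: int) -> List[Dict[str, int]]:
    """Return start, end, peak, peak_time and sum for each hit above threshold."""
    hits: List[Dict[str, int]] = []
    current: Optional[Dict[str, int]] = None
    for t, s in enumerate(samples):
        if s > threshold:
            if current is None:
                # First sample above threshold opens a new hit.
                current = {"start": t, "end": t, "peak": s, "peak_time": t, "sum": 0}
            current["end"] = t
            current["sum"] += s
            if s > current["peak"]:
                current["peak"] = s
                current["peak_time"] = t
        elif current is not None:
            # Sample at or below threshold closes the open hit.
            hits.append(current)
            current = None
    if current is not None:
        hits.append(current)  # hit still open at the end of the block
    return hits


hits = find_hits([0, 1, 5, 9, 7, 2, 0, 6, 0], threshold=3)
```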
Filter/Pedestal Subtraction + Hit finding: Status
● The filter + pedestal subtraction / hit finding algorithm has been simulated and synthesized successfully.
● The resource utilization estimates are ready (they look satisfactory).
● Developers at RAL and Bristol have implemented TPG for a single channel.
Filter/Pedestal Subtraction + Hit finding: Status
Single event TPG chain validation
Compression
● Fibonacci encoding via LUT implemented in firmware and simulated
→ Encode delta from previous sample so most entries are zero
→ Compression factor depends on noise profile of samples
● Resource estimate suggests ~1% of Block RAMs used per compression stream → 40 instances should fit on the targeted FPGA.
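To illustrate why delta encoding pairs well with Fibonacci coding, the sketch below encodes sample-to-sample differences so that a zero delta costs only two bits. Fibonacci codes exist only for positive integers, so signed deltas are first mapped to positive integers with a zigzag mapping; that mapping, the separate handling of the first sample and the function names are assumptions, not a description of the firmware LUT.

```python
from typing import Sequence


def zigzag(delta: int) -> int:
    """Map a signed delta to a positive integer: 0->1, -1->2, 1->3, -2->4, ..."""
    return (2 * delta if delta >= 0 else -2 * delta - 1) + 1


def fibonacci_code(n: int) -> str:
    """Fibonacci code of a positive integer as a bit string ending in '11'."""
    if n < 1:
        raise ValueError("Fibonacci coding is defined for positive integers")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()  # drop the first Fibonacci number larger than n
    bits = ["0"] * len(fibs)
    remainder = n
    for i in range(len(fibs) - 1, -1, -1):  # greedy Zeckendorf decomposition
        if fibs[i] <= remainder:
            bits[i] = "1"
            remainder -= fibs[i]
    return "".join(bits) + "1"  # the extra '1' terminates the code word


def encode_deltas(samples: Sequence[int]) -> str:
    """Encode differences between consecutive samples (first sample sent separately)."""
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    return "".join(fibonacci_code(zigzag(d)) for d in deltas)


encoded = encode_deltas([500, 500, 501, 500, 500])
```

For these five quiet samples the four deltas take 11 bits in total, which is why the compression factor depends so strongly on the noise profile.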
Buffer Management
● Memory management block has a 40-channel input
● Each input channel has an associated FIFO (currently built from block RAMs, not the Xilinx FIFO core)
● Input and output sides clocked at 200 and 300 MHz respectively
● 16-bit wide write side, 64-bit wide read side
● Each block RAM has a size of 16 bits x 512
● Another block RAM structure aggregates the writes to DDR4.
[Figure: Stream to memory block]
Buffer Management
● An addition has been made to the input stream packet:
● A 32-bit magic word indicates the start of a packet from a channel
● A 16-bit length value follows the magic word
● The incoming packet header follows that and is also written to DDR.
● KCU105 board used to evaluate resource usage (Kintex UltraScale FPGA)
[Figure: Packet structure]
[Figure: BlockRAM resource usage]
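The packet framing and the 16-bit to 64-bit width conversion can be sketched as follows. Several details are assumptions for illustration only: the magic word value, little-endian byte order, the length counting the 16-bit words that follow it, zero padding of the final 64-bit word, and the names `frame_packet` and `pack_64`.

```python
import struct
from typing import List, Sequence

MAGIC_WORD = 0xDEADBEEF  # hypothetical value; the real magic word is not given


def frame_packet(header: Sequence[int], payload: Sequence[int]) -> bytes:
    """Build: magic (32 bit) | length (16 bit) | header words | payload words."""
    words = list(header) + list(payload)
    # Assumption: length counts the 16-bit words following the length field.
    return struct.pack(f"<IH{len(words)}H", MAGIC_WORD, len(words), *words)


def pack_64(words16: Sequence[int]) -> List[int]:
    """Group 16-bit words four at a time into 64-bit words, first word in the low bits."""
    padded = list(words16) + [0] * (-len(words16) % 4)  # zero padding: assumption
    out = []
    for i in range(0, len(padded), 4):
        value = 0
        for j, word in enumerate(padded[i:i + 4]):
            value |= (word & 0xFFFF) << (16 * j)
        out.append(value)
    return out


packet = frame_packet(header=[1], payload=[2, 3])
wide = pack_64([1, 2, 3, 4])
```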
Wrapper/Block Management Framework
● A self-consistent prototype of the wrapper is in place; a demonstrated example wrapper with IPBUS is on our twiki.
● The header stripper receives data for the set of wires associated with a processing block.
● "RAM" of size Nwords x Nchannels is needed (it will be located inside the header stripper).
● At the start of a new wire, the header stripper sets a flag, saves state and writes state.
● Communication with blocks:
→ Clock (same frequency throughout the system, e.g. 200 MHz)
→ Reset
→ Data in/out: AXI4-stream (same flags etc. as the input and output channels, but the algorithm never sees headers, just straight data)
→ Configuration: 16-bit data, N-bit address, in and out, WE, RE. Same as a Xilinx RAM. One cycle latency for read/write.
● A fully working example is a work in progress.
Outlook
● All the firmware blocks are simulated/synthesized. ✔
● Preliminary reports on resource usage by each block look satisfactory and within reach. ✔
● A chain involving the wrapper, filter/pedestal and hit finding is in the works. ✔
● A parallel chain in software simulation. ✔
● A single-channel realization of the front end on the ZCU102 (expected very soon)
● Integration of the block management system with compression and supernova buffer management. (Next step ...)
● Multichannel multiplexing front-end firmware with state saving. (Next step ...)
● An extensive test plan for the front-end firmware is in place. ✔
Summary
● Overall, the front-end firmware realization is in good shape.
● Multiple ZCU102 evaluation boards are now available for developers.
● The next 2-3 months are crucial to establish an effective chain of blocks.
● Work on documentation has already begun.
● We are making steady progress.
● Follow us at:
Single Phase Front End Firmware Twiki
Gitlab working area: https://gitlab.cern.ch/DUNE-SP-TDR-DAQ
Resource usage
Specifications in detail
Fast Hit Finder
● Start from pedestal-subtracted ADC samples, s_i.
● A hit is recorded when the ADC value goes above a pre-defined threshold and lasts until it drops below the threshold.
[Figure: ADC samples s_i versus time, with the times ts, tp and tf marked]
● What do we intend to store in the trigger primitive?
- t0: start time
- tf: end time
- peak time
- peak amplitude
- sum of s_i over the hit period
● Integrate several fixed-size windows before and after the pulse.
● Choose which to include in the charge estimation later.
● Optimize the size/number of windows with simulation.
[Figure: ADC samples s_i versus time, with windows of width Δt marked]
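The idea of integrating fixed-size windows on either side of the pulse can be sketched as below. The window layout (contiguous windows, nearest to the pulse first), the clipping at the edges of the sample block and the function name `window_sums` are assumptions; the slide leaves the size and number of windows to be optimized in simulation.

```python
from typing import List, Sequence, Tuple


def window_sums(
    samples: Sequence[int], t_start: int, t_end: int, width: int, count: int
) -> Tuple[List[int], List[int]]:
    """Sum `count` windows of `width` samples before t_start and after t_end."""

    def total(lo: int, hi: int) -> int:
        # Clip the window to the available samples; empty windows sum to zero.
        lo = max(lo, 0)
        hi = min(hi, len(samples))
        return sum(samples[lo:hi]) if lo < hi else 0

    before = [
        total(t_start - (k + 1) * width, t_start - k * width) for k in range(count)
    ]
    after = [
        total(t_end + 1 + k * width, t_end + 1 + (k + 1) * width) for k in range(count)
    ]
    return before, after


pulse = [1, 2, 3, 4, 10, 20, 10, 4, 3, 2, 1, 0]
before, after = window_sums(pulse, t_start=4, t_end=6, width=2, count=2)
```

A later charge estimate could then add the hit sum to whichever of these window sums are chosen, without the firmware having to commit to one choice in advance.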
Status (1/3)
● A modified wrapper (I/O, buffering etc.) is already present.
● Version 1 of the simulation is ready (QuestaSim).
● The algorithm has now been synthesized in Vivado and is ready for review and further iterations.
Status (2/3)
● Sample files from the framework with truth information are available; we have also simulated some sample files with exactly one hit.
● VHDL simulation status: Hit Finder Module v1 ready.
Test bench (will read sample files) → Algorithm → Test bench (will write output in readable format)
Status (3/3)
● Firmware side
- To complete a simulation of Hit finding module. ✔
- Writing a test bench. ✔
- Iterate the design for optimization. ( work in progress ...)
- Synthesis and implementation in vivado. ✔
- Resource estimate (we have initial pointers.) ✔
- Combining the HFA with Dave’s wrapper. ( work in progress ...)
● Software side
- Writing a software module for a hit finding algorithm
- Analyzing various samples.
● Finally comparing the software and firmware implementation
Highlights of the code (1/2)
● Threshold timing analysis
● There are 3 flags here. The first, “inhibit”, acts as an inhibitor: it breaks the process to prevent continued stuttering and invalid output.
● in_hit_v remains high while the ADC pulses remain above threshold.
● hit_ready_v goes high after the pulse drops below threshold.
● The code snippet describes the single-threshold timing analysis.
● Saving the current clock time (accsec) into eventprint(0-2) is triggered the first time the incoming ADC pulses exceed the given threshold value.
● When the ADC pulses drop below the threshold again, the time is stored in the eventprintread_v variable.
Highlights of the code (2/2)
Peak value, peak time and sum of adc values
● The running sum is recorded for the entirety of the hit and made available when hit_ready_v goes high.
● The code snippet describes the logic for finding the hit peak, peak time and sum of ADC pulses within the hit duration.
● The current peak value of the hit is saved and compared with the incoming value, and the time is recorded.
Test Bench Output
Simulation output of the HFA, tested with an example waveform file consisting of increasing ADC values to imitate an actual waveform:
Hit start time = 15
Hit end time = 25
Hit peak = 567
Hit peak time = 22
Hit sum = 5633
Output ready = 1
Synthesis on Vivado 2017.2
● Synthesized the code on Vivado 2017.2, target board KCU105
● RTL schematic ( elaborated design )
Synthesis
● Synthesized schematic
Utilization report
Resource Estimation (Discussion)
● It is clear from slide 12 that the current algorithm uses very few resources.
● For 960 collection-plane wires, assuming 100-to-1 multiplexing, the LUTs utilized will be 5140 out of 242400 available.
● The port utilization is currently high because all the input and output ports are implemented as 32-bit integers.
● If we multiplex, we also have to check the memory utilization.
● One more point came up while I was discussing this with DC, about “state saving”, which we can discuss in our group meeting.
● These are very preliminary estimates and, as per my understanding, we can have a more rounded discussion.
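The LUT figure can be put in proportion with a few lines of arithmetic. The per-instance number below is an inference from the quoted totals (it assumes one algorithm instance per group of up to 100 wires), not a value taken from the utilization report.

```python
import math

WIRES = 960               # collection-plane wires (from the slide)
MULTIPLEXING = 100        # wires served per algorithm instance (from the slide)
TOTAL_LUTS = 5140         # LUTs quoted on the slide
AVAILABLE_LUTS = 242_400  # LUTs available on the target device (from the slide)

# Assumption: a partially filled group still needs a full instance.
instances = math.ceil(WIRES / MULTIPLEXING)

# Inferred, not quoted: LUTs per instance if the total scales linearly.
luts_per_instance = TOTAL_LUTS / instances

# Fraction of the device LUTs taken by the whole hit-finding stage.
lut_fraction = TOTAL_LUTS / AVAILABLE_LUTS
```

Under these assumptions the stage needs 10 instances at roughly 514 LUTs each, about 2.1% of the available LUTs, which leaves memory and port usage as the quantities to watch when multiplexing.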
Update
● After making v1 of the block, I am currently working on:
A → A manipulator which takes Phil’s file, converts it to the required data format and inserts a pseudo header.
B → Developing a test-bench which performs the following tasks:
→ Takes the data format produced by the manipulator and strips the header.
→ Stores the header temporarily and processes the data block of 256 samples.
→ Sends the 256 samples to the HFA sequentially and finds the hit parameters.
→ Produces an output text file with header and hit parameters, along with a hit continue signal.
C → Developing the next block, which takes this output text file and puts it into a data word format.
A. Manipulator (converting Phil’s file)
● A simple Python script is ready which does the following task: ✔
Phil’s file:
Event# channel# collection? adc0 adc1 adc2 adc3 ..
Event# channel# collection? adc0 adc1 adc2 adc3 ..
→ Script →
● An agreed v1 of the header format (streaming interface) looks like:
header1 header2 .. header_n adc0 adc1 adc2 ... adc255
header1 header2 .. header_n adc0 adc1 adc2 ... adc255
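A conversion of this kind could look like the sketch below. It is not the actual script: the pseudo-header fields (event, channel, collection flag, block index), the split into blocks of 256 samples and the zero padding of a short final block are all assumptions, and `convert_line` is a hypothetical name.

```python
from typing import List

BLOCK_SIZE = 256  # samples per data block, as used by the test-bench


def convert_line(line: str) -> List[str]:
    """Turn 'event channel collection adc0 adc1 ...' into header + 256-sample rows."""
    fields = line.split()
    event, channel, collection = fields[:3]
    adcs = fields[3:]
    rows = []
    for block_index, start in enumerate(range(0, len(adcs), BLOCK_SIZE)):
        block = adcs[start:start + BLOCK_SIZE]
        block += ["0"] * (BLOCK_SIZE - len(block))  # zero padding: assumption
        # Pseudo header: these four fields are placeholders for illustration.
        header = [event, channel, collection, str(block_index)]
        rows.append(" ".join(header + block))
    return rows


example_line = "7 12 1 " + " ".join(str(i) for i in range(300))
rows = convert_line(example_line)
```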
B. Test-bench
● A v1 test-bench is ready which reads from a header-less file and produces hit parameters. ✔
● The new proposal (partially ready / work in progress ...):
Input rows: header1 header2 .. header_n adc0 adc1 adc2 ... adc255
STRIP: header1 header2 .. header_n is set aside; adc0 adc1 adc2 ... adc255 goes to the HFA
HFA: produces Hit parameter 1, Hit parameter 2, Hit parameter 3, .., Hit parameter n
Combine: header and hit parameters are merged into the OUTPUT
B. Test-bench
● Output format (*.txt):
Header 0 Header 1 … Header n Hit parameter 1 Hit parameter 2 … Hit Continue
Header 0 Header 1 … Header n Hit parameter 1 Hit parameter 2 … Hit Continue
Header 0 Header 1 … Header n Hit parameter 1 Hit parameter 2 … Hit Continue
Header 0 Header 1 … Header n Hit parameter 1 Hit parameter 2 … Hit Continue
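The strip and combine steps around the HFA can be sketched as follows, with the hit finding itself left out. The number of header words, the order of the hit parameters and the rule for the hit-continue flag (set when a hit is still open on the last sample of the block) are assumptions; `split_row` and `output_lines` are hypothetical names.

```python
from typing import List, Sequence, Tuple

# (start, end, peak, peak_time, sum) for one hit; the ordering is an assumption.
HitParams = Tuple[int, int, int, int, int]


def split_row(row: str, header_words: int) -> Tuple[List[str], List[int]]:
    """STRIP: separate the header fields from the ADC samples of one row."""
    fields = row.split()
    header = fields[:header_words]
    adcs = [int(value) for value in fields[header_words:]]
    return header, adcs


def output_lines(
    header: Sequence[str], hits: Sequence[HitParams], block_len: int
) -> List[str]:
    """Combine: one text line per hit, header first, hit-continue flag last."""
    lines = []
    for hit in hits:
        end = hit[1]
        # Assumption: a hit ending on the last sample continues into the next block.
        hit_continue = 1 if end == block_len - 1 else 0
        fields = list(header) + [str(v) for v in hit] + [str(hit_continue)]
        lines.append(" ".join(fields))
    return lines


header, adcs = split_row("7 12 1 0 5 6 7", header_words=4)
lines = output_lines(header, [(0, 1, 9, 1, 15), (2, 2, 7, 2, 7)], block_len=len(adcs))
```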
C. Data Word Writer/Packetizer
Header 0 Header 1 … Header n Hit parameter 1 Hit parameter 2 … Hit Continue
Header 0 Header 1 … Header n Hit parameter 1 Hit parameter 2 … Hit Continue
Header 0 Header 1 … Header n Hit parameter 1 Hit parameter 2 … Hit Continue
Header 0 Header 1 … Header n Hit parameter 1 Hit parameter 2 … Hit Continue
→ PACKETIZER (completely synthesizable block) → 16-bit word stream
● This is a proposed step, which will produce a stream of 16-bit data words/packets in a pre-fixed format. (Not started yet ...)
Update
● The output is ready.
● Consider the following dummy cases:
A. No hit
B. One hit in a packet
C. Overlapping hit
D. Two or more hits in a packet
Summary and next steps
● This test-bench and the formatted file can act as a stand-alone tester for various co-processor blocks.
● The next item on the agenda is to integrate this block with Dave’s wrapper.
● Establishing communication of the HFA with the filter from Kostas, using this test bench.
● Multiplexing is the next stage, which will also come with a state-saving requirement.
● This test bench combined with the filter can act as a v0 demonstrator of the co-processor.