13
SIMDized cluster transformation for ALICE HLT Sergey Gorbunov

SIMDized cluster transformation for ALICE HLT

  • Upload
    viho

  • View
    42

  • Download
    0

Embed Size (px)

DESCRIPTION

SIMDized cluster transformation for ALICE HLT. S ergey Gorbunov. ALICE HLT: Reconstruction in TPC tracker . TPC sector 0. TPC sector 35. . The Cluster Transformation applies calibration and alignment to the TPC clusters. It was the show stopper for HLT: - PowerPoint PPT Presentation

Citation preview

Page 1: SIMDized  cluster transformation  for  ALICE HLT

SIMDized cluster transformation for ALICE HLT

Sergey Gorbunov

Page 2: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 2/13

ALICE HLT: Reconstruction in TPC tracker

TPC sector 0

Cluster Finder

ClusterTransformati

on

Sector Tracker

TPC track merger

TPC sector 35

Cluster Finder

ClusterTransformati

on

Sector Tracker

The Cluster Transformation applies calibration and alignment to the TPC clusters.It was the show stopper for HLT:

off-line code under development, speed is out of the HLT control

getting more and more complicated suddenly became 10 times slower

CPU, slow

FPGA

GPU

CPU, fast

Page 3: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 3/13

ALICE HLT: Transformation of TPC clusters

TPC cluster transformation: (Pad, Time)->(X,Y,Z)

TPC sector

Readout pads

Tim

e bi

n

p0 p1

t0

t9

row k

Row number -> XPad -> YTime -> Z

Page 4: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 4/13

ALICE HLT: New composed transformation

1. TPCExBShape2. alignGlobal3. alignLocal4. alignQuadrant5. FCVoltError3D6. FitBoundary7. FitExBTwist8. FitAlignTPC9. FitRocAlignZSum… there will be more

Composed TPC transformation:

Too many different transformations – no hope to optimize the code ...

Speed: 7.9 ms/cluster = 6 s/event (PbPb)

Page 5: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 5/13

ALICE HLT: Cluster transformation for particular TPC pad

Final transformation: It does not look that complicated:

X

Y Z

X zoomed

Sec=0, Row=0, pad coordinate in [30,31]

Page 6: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 6/13

ALICE HLT: Fast transformation

Idea: Fit the original Transformation with polynoms/splines:

Initialization phase: Create temporary splines by calling TPCTransform() : - no matter which corrections are applied, - no matter how slow it isRunning phase: Use of the splines: - few arithmetic operators --- very fast - transformation time is constant: no more dependence on other code and data base objects

Page 7: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 7/13

ALICE HLT: Fast transformation

Idea: Fit the original Transformation with polynoms/splines:

Requirements:- Accuracy: 100 um- Speed: 10 times faster - Transformation for 33 rows should fit to CPU cache- Use of SIMD vectors (Vc package) for faster

calculations

Size of cluster, Y Size of cluster, Z

Page 8: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 8/13

ALICE HLT: Fast transformation: Polynomial fit

Fit with polynoms: Many local features- does not work well:

Difference with polynom of 0 order [cm]:

Difference with 5-th order polynom [cm]:

X

X

Y

Y Z

Z

Page 9: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 9/13

ALICE HLT: Fast transformation: Fit with splines

Fit with splines: Accuracy about 0.1 um for X,Y,Z

X

Worst deviation in X: 88um, in Y: 81um, in Z: 3.8um

Page 10: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 10/13

ALICE HLT: Fast transformation

Fast HLT transformation: TPC row with the worst deviations

(Orig. - Fast transformation)[um] vs Pad and Time

Page 11: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 11/13

ALICE HLT: Fast transformation

Fast HLT transformation: TPC row with the worst deviations

This particular TPC row seems to be miscalibrated to 140um -> accuracy of the Fast Transformation [80um] here is not worse than the accuracy of the original transformation [140um].

140 um gap120 um gap

Original X vs pad Original Y vs pad

Page 12: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 12/13

ALICE HLT: Parallel algorithm (Cellular Automaton)

Fast HLT Transformation: Results

- Accuracy: 0.1 um- Speed-up in scalar: 58.1 times - Speed-up with SIMD vectors: 83.6 times (SIMD give factor 1.44 extra speed-up) - Transformation size for 33 rows: 170kB ( full TPC:

29MB)Use of SIMD vectors:

- Vc package from M.Kretz – very useful tool - Spline points are stored as 4-vectors: (x,y,z,*)

- This scheme allows one to avoid packing of data to vectors, but it only works with SIMD vectors of size 4.

Page 13: SIMDized  cluster transformation  for  ALICE HLT

Sergey Gorbunov, FIAS30 November 2012, CERN 13/13

ALICE HLT: Summary

Parallelization in ALICE HLT:• All time-consuming HLT components are runningnow on parallel hardware:

- cluster finder: FPGA- cluster transformation: SIMD (via Vc package)- sector tracker: GPU

• Fast transformation gives factor of 83.6 times speed-up • Vc package installed and can be used offline as well•