Upload
others
View
14
Download
0
Embed Size (px)
Citation preview
Flexible X-Ray Image Processing on GPUsMatthias Vogelgesang, Suren Chilingaryan and Andreas Kopmann
X-Ray Tomography at Synchrotrons
Image Processing Toolkit
X-ray tomography is a non-destructive technique to analyze and visualize the inner structure of organisms and materials in three and four dimensions.
• Core written in C• Uses hardware-dependent OpenCL kernels• Python and JS bindings• Implicit buffer management• Multi-threading and implicit synchronisation using asychronous queues • Automatic node-to-GPU mapping using heuristics
from gi.repository import Ufo
g = Ufo.Graph()for f in ['reader', 'writer', \ 'fft', 'ifft', 'ramp']: globals()[f] = g.get_filter(f)
bp = g.get_filter('backproject')fft.set_properties(dimensions=1)reader.connect_to(fft)fft.connect_to(ramp)ramp.connect_to(ifft)ifft.connect_to(bp)bp.connect_to(writer)
g.run()
High pixel resolutions, bit depths and frame rates of state-of-the-art cameras increase the size of a single data set to several gigabytes. In order to process sequences of recon-structed volumes during acquisition, GPU processing is in-evitable.
Our software toolkit is used for pre-processing, image re-construction and post-processing in online and offline environ-ments. Most algorithms can be expressed as a composition of smaller sub-operations such as convolutions, filters and complex arithmetics. Thus, we map each of the processing stages to a computation graph representing the full algorithm. Each node in the graph is connected using a queue that is used to push processed data to its successor and scheduled according to its workload.
Some of its key features are:
ReferenceChilingaryan, Mirone, Hammersley, Ferrero, Helfen, Kopmann, dos Santos Rolo and Vagovic. “A GPU-Based Architecture for Real-Time Data Assess-ment at Synchrotron Experiments” IEEE Transactions on Nuclear Science, vol. 58, pp. 1447–1455, Aug. 2011.
Results
Tesla S1070
0 200 400 600 800 1000 1200
1099
269
75
64
38
Reconstruction time in seconds
2x GTX 295
2x Xeon E5472
1x GTX 280
4x GTX 580
1 2 3 4
Number of GPUs
200
400
600
800
GFL
OPS
Scalability of Back-Projection on Tesla S1070Time for reconstructing an 11 GB 3D Volume from 2000 Projections (~ 24 GB)
Acquisition of Projections
Example: Reconstruction Graph
Sinogram Generation
FFT Filter IFFT
Back-Projection onone or more GPUs
Storage Segmentation/Meshing
Except for acquisition and storage, each node is executed on one of theavailable GPUs according to a heuristic.
Hardware Setup
100 fps, 5.5 Megapixel
330 fps, 2.2 Megapixel
Sychrotron Radiation10–65 keV Energy range Data Processing
w/ up to 6 GPUsHadoop Storage Cluster2 PB disk + 2.5 PB tape
Online evaluation allows us to adjust motor and acquisition parameters
Truelinearity