Rapid Diagnosis of Acute Heart Disease by Cloud-based High Performance Computing for Computer Vision

Oleksii Morozov
Physics in Medicine Research Group, University Hospital of Basel, Switzerland

April 8, 2010


The Heart

Life-sustaining pump: 2’500’000 L/year of vital blood

Coronary artery disease (CAD) is the most frequent cause of heart malfunction and death

World's largest killer (WHO)
~29% of global deaths
17’100’000 lives/year

Cardiology yesterday

Tools with relatively low information content

Cardiology today

More tools, more information
Subjective decisions, relying mostly on the doctor's experience

Cardiology tomorrow

More advanced technologies
Multidimensional information
High quality, high resolution data
Multimodal information
Quantitative, objective, integrative computer-based analysis
Worldwide-networked standards and databases

Problems
Need for high performance computing in a distributed environment, but only for a fraction of the time
Global storage network for storing large datasets

Cardiac Ultrasound

One of the modern tools for evaluation of the heart function

HF sound waves: no radiation/ionization
Safe, non-invasive, fast, portable, cheap
Drawback: rather low signal-to-noise ratio

Cardiac Ultrasound: Diagnostic value

Heart wall assessment
Pumping function
Valve function

3D Cardiac Ultrasound

Explore heart in 3D
Freehand ultrasound (manual sweeping)
Mechanical sweeping ultrasound (motor driven)
Live 3D ultrasound (2D arrays with electrical sweeping)

Cardiac Ultrasound

Ultrasound machine = a transducer + a supercomputer

50’000 – 500’000 USD

Idle 90% of the time

Computational problems in Cardiac Ultrasound

Signal reconstruction

Non-uniformly sampled measurements

Complete gridded or continuous data representation

3D+time signal reconstruction

Inherent non-uniformity of scanning
  Spatial non-uniformity
  Serialism in scanning
Non-uniformity in synchronization (ECG)
Body motion artifacts (breathing)

4D non-uniform data: I(x,y,z,t) ?

3D+time signal reconstruction: A spline solution

B-spline non-uniform interpolation by Arigovindan, Unser (EPFL, Switzerland 2005)

Robust global interpolation: handles oversampling and undersampling (gaps) in the data
Sparse and well-conditioned alternative to the optimal RBF solution
Enjoys multiresolution properties (a route to fast solving)
Solving process can be parallelized
Successfully applied to 2D problems

3D+time signal reconstruction: A spline solution

Obstacles in 3D/4D
Complexity depends exponentially on the data size
128 x 128 x 128 x 18 -> 78’752’009’856 non-zeros (312 Gbyte in single precision)

Tensor based approach by Morozov, Hunziker, Unser 2009

Tensor decomposition of the problem
Relaxed storage requirements
Feasibility on standard workstations

~9 million measurements on a 128 x 128 x 128 x 18 grid -> 30 minutes on my dual core laptop
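The non-zero count quoted for the 128 x 128 x 128 x 18 grid can be reproduced under one assumption that the slides do not state: that the system matrix is a Kronecker product of one 7-diagonal banded matrix per axis (the band structure a cubic B-spline basis typically produces). The function names below are illustrative only.

```python
def banded_nonzeros(n, half_bandwidth=3):
    """Non-zeros of an n x n banded matrix with the given half-bandwidth."""
    # The main diagonal has n entries; the k-th off-diagonal has n - k, on both sides.
    return n + 2 * sum(n - k for k in range(1, half_bandwidth + 1))


def kronecker_nonzeros(shape):
    """Non-zeros of a Kronecker product of one banded matrix per axis (assumed structure)."""
    total = 1
    for n in shape:
        total *= banded_nonzeros(n)
    return total


nnz = kronecker_nonzeros((128, 128, 128, 18))
print(nnz)            # 78752009856
print(nnz * 4 / 1e9)  # bytes for the values alone in single precision, in units of 10^9
```

At 4 bytes per value this comes to about 315 x 10^9 bytes, the same order as the storage figure on the slide.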

3D+time signal reconstruction: A spline solution

Tensor based approach applied to ultrasound data from continuously rotating transducer

Computational problems in Cardiac Ultrasound

Tissue/blood motion estimation
Doppler ultrasound imaging (state of the art)
  Semi-quantitative measurements
Full motion reconstruction
  Generalization of B-spline reconstruction to vector valued data (Arigovindan, Unser 2005)
  Employing additional constraints from physics of fluids (incompressibility, Navier-Stokes equations)

Computational problems in Cardiac Ultrasound

B-spline based tissue motion reconstruction
Continuous
Fully quantifiable
Can be combined with Doppler for better robustness

Computational problems in Cardiac Ultrasound

Blood flow reconstruction
Resolves ambiguity of Doppler measurements
Continuous
Fully quantifiable

Pathway to distributed supercomputing

Multicore (IBM Power7): claimed 260 GFLOP/chip
Cluster (UniBasel): 34’500 GFLOP/400 cores
GPGPU (ATI 4870X2): 2’000 GFLOP/card; GPGPU array
FPGA accelerator cards: dozens of GFLOP/chip, up to 512 chips per system, low power
In exploration within ICES Microsoft project
Cloud: Microsoft Azure

Cloud Ultrasound Processing Service

Reasons
Processing of large multidimensional multimodal medical data requires vast computational power
Building/maintaining own HPC infrastructure is overly expensive
Relatively rare use of HPC power (a few times per day)
Availability at multiple points of care (medical practices and hospital emergency rooms)
Unified storage/access of the multimodal medical data

Cloud Ultrasound Processing Service

[Diagram labels: User; Record data; 4D acquisition with real-time on-board visualization; Cloud; Interactive web-based visualization of the result; Rendered images and quantitative information; Visualization/Analysis parameters]

Cloud Ultrasound Processing Service

Record data
Raw data: 180 beams x 500 samples x 100 frames x 10 sec -> 85 MB
Additional information (geometry) -> a few KB
Lossless compressed DICOM
Low-latency response to the user by first sending a subpart of the data for a coarser-resolution reconstruction
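The raw data size can be checked with a few lines of arithmetic. This is a sketch under assumptions not stated on the slide: "100 frames" is read as 100 frames per second, each sample is taken to occupy one byte, and the size is expressed in units of 2^20 bytes.

```python
beams, samples_per_beam = 180, 500
frames_per_second, seconds = 100, 10
bytes_per_sample = 1  # assumption: 8-bit samples

total_samples = beams * samples_per_beam * frames_per_second * seconds
size_mib = total_samples * bytes_per_sample / 2**20

print(total_samples)       # 90000000
print(round(size_mib, 1))  # 85.8
```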

Cloud Ultrasound Processing Service: Signal reconstruction

The problem is too large for direct solvers -> use an iterative solver

$C_{i+1} = C_i + \mathrm{OP}(C_i)$, where OP is a linear operator

[Figure: reconstruction after 2, 50 and 80 iterations]
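A minimal sketch of an iteration of this form follows. The slides do not define OP; here it is assumed to be the residual b - A c of a small linear system, with A a hypothetical tridiagonal matrix with stencil (1/6, 2/3, 1/6). All names are illustrative.

```python
def apply_a(c):
    """Multiply by a tridiagonal matrix with stencil (1/6, 2/3, 1/6), zero outside."""
    n = len(c)
    out = []
    for i in range(n):
        left = c[i - 1] if i > 0 else 0.0
        right = c[i + 1] if i < n - 1 else 0.0
        out.append(left / 6 + 2 * c[i] / 3 + right / 6)
    return out


def op(c, b):
    """Assumed update term: the residual b - A c."""
    return [bi - ai for bi, ai in zip(b, apply_a(c))]


def iterate(b, iterations):
    """Run C_{i+1} = C_i + OP(C_i) from a zero initial guess."""
    c = [0.0] * len(b)
    for _ in range(iterations):
        c = [ci + di for ci, di in zip(c, op(c, b))]
    return c


def residual_norm(c, b):
    return max(abs(r) for r in op(c, b))


b = [float(i % 5) for i in range(32)]
for k in (2, 50, 80):
    print(k, residual_norm(iterate(b, k), b))
```

With this choice of A the residual shrinks at every step, so the printed values decrease as the iteration count grows.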

Cloud Ultrasound Processing Service: Signal reconstruction

Iteration can be distributed relative to the grid

[Diagram: grid with spacing dx, dy, partitioned into four blocks]
C{1,1} C{1,2}
C{2,1} C{2,2}

$C^{\{1,1\}}_{i+1} = C^{\{1,1\}}_i + \mathrm{OP}^{\{1,1\}}(C_i)$
$C^{\{1,2\}}_{i+1} = C^{\{1,2\}}_i + \mathrm{OP}^{\{1,2\}}(C_i)$
$C^{\{2,1\}}_{i+1} = C^{\{2,1\}}_i + \mathrm{OP}^{\{2,1\}}(C_i)$
$C^{\{2,2\}}_{i+1} = C^{\{2,2\}}_i + \mathrm{OP}^{\{2,2\}}(C_i)$

C{k,m}: solution subpart dedicated to a compute unit
OP{k,m}(): operator applied by compute unit {k,m}

Completely independent output

Cloud Ultrasound Processing Service: Signal reconstruction

Data dependency
OP{k,m}() uses data outside the bounds of C{k,m}
Extent of dependent input data: 3 samples for a cubic spline
At each iteration this data is transferred among adjacent units
Performance-limiting factor
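The distribution scheme and its data dependency can be checked on a one-dimensional toy problem. In this sketch each tile updates only its own coefficients but reads 3 extra samples from each neighbour, and the tiled result is compared with an undistributed step. The 7-tap stencil weights and the residual form of OP are assumptions made for illustration; the number of samples must be divisible by the number of tiles.

```python
HALO = 3  # samples needed beyond a tile's bounds (cubic spline, per the slide)
WEIGHTS = [0.01, 0.04, 0.1, 0.7, 0.1, 0.04, 0.01]  # hypothetical 7-tap stencil


def op_range(c, b, start, stop, offset=0):
    """Return b - A c for global indices start..stop-1.

    `c` holds coefficients for global indices offset..offset+len(c)-1;
    samples outside that range are treated as zero.
    """
    out = []
    for i in range(start, stop):
        acc = 0.0
        for k, w in enumerate(WEIGHTS):
            j = i + k - HALO - offset
            if 0 <= j < len(c):
                acc += w * c[j]
        out.append(b[i] - acc)
    return out


def global_step(c, b):
    """One undistributed step C_{i+1} = C_i + OP(C_i)."""
    d = op_range(c, b, 0, len(c))
    return [ci + di for ci, di in zip(c, d)]


def tiled_step(c, b, n_tiles):
    """The same step, computed tile by tile from each tile plus its halo."""
    n = len(c)
    size = n // n_tiles
    new_c = []
    for t in range(n_tiles):
        start, stop = t * size, (t + 1) * size
        lo, hi = max(0, start - HALO), min(n, stop + HALO)
        local = c[lo:hi]  # the tile plus the halo received from its neighbours
        d = op_range(local, b, start, stop, offset=lo)
        new_c.extend(ci + di for ci, di in zip(c[start:stop], d))
    return new_c


b = [float((3 * i) % 7) for i in range(24)]
c = [0.5 * i for i in range(24)]
print(tiled_step(c, b, 4) == global_step(c, b))  # True
```

Each tile writes a disjoint part of the output, which is why the outputs are independent, while the halo slices are the data that would have to travel between adjacent units at every iteration.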

Cloud Ultrasound Processing Service: Signal reconstruction

Data dependency
Data size 512 x 512 x 512 x 64
Single precision: 32 GB
Infiniband QDR 12X (~12 GB/s)

Number of units   Dependent data per unit, MB   Total transfers per iteration, MB   Max iterations/s (excluding CPU time)
64                72                            4608                                166
128               48                            6144                                250
256               30                            7680                                400
512               18                            9216                                666
1024              12                            12288                               1000
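The figures in this table can be reproduced with a short calculation. This is a sketch under assumptions the slides do not spell out: only the three spatial axes are partitioned (the 64 time samples stay together), every block exchanges a 3-sample layer on each of its six spatial faces, sizes are counted in units of 2^20 bytes, and the link rate is taken as 12'000 MB/s. The particular splits per unit count are also assumed.

```python
GRID = (512, 512, 512)  # spatial extent
T = 64                  # time samples (assumed not to be partitioned)
HALO = 3                # samples per face for a cubic spline
BYTES = 4               # single precision
LINK_MB_PER_S = 12_000  # ~12 GB/s, as quoted on the slide

# Assumed spatial splits; the slides give only the number of units.
SPLITS = {
    64: (4, 4, 4),
    128: (8, 4, 4),
    256: (8, 8, 4),
    512: (8, 8, 8),
    1024: (16, 8, 8),
}


def halo_mb(split):
    """Dependent data per unit: a HALO-thick layer on all six spatial faces."""
    bx, by, bz = (g // s for g, s in zip(GRID, split))
    face_samples = 2 * (bx * by + by * bz + bx * bz) * T
    return face_samples * HALO * BYTES / 2**20


for units, split in SPLITS.items():
    per_unit = halo_mb(split)
    print(units, per_unit, per_unit * units, int(LINK_MB_PER_S / per_unit))
```

Under these assumptions the per-unit sizes come out as 72, 48, 30, 18 and 12 MB, and dividing the link rate by them gives the iteration rates in the last column.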

Cloud Ultrasound Processing Service: Signal reconstruction

Computational load
Data size 512 x 512 x 512 x 64
Intel® Quad Core 2.67 GHz (~30 GFLOP/s in single precision)
PC3-10600 DDR3-SDRAM (30 GB/s)
115’000’000 data samples
Total requirements: ~5000 GFLOP, ~3000 GB of memory transfers

Number of units   Max iterations/s (including inter-unit communication)
64                0.24
128               0.48
256               0.96
512               1.92
1024              3.84
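The iteration rates above are consistent with a simple model, sketched here under two assumptions: the quoted totals are per iteration, and compute time and memory-transfer time add up without overlap. Inter-unit communication is left out of the sketch; at the link rates quoted earlier it would change the results by well under one percent.

```python
TOTAL_GFLOP = 5000  # per iteration (assumed), from the slide
TOTAL_GB = 3000     # memory traffic per iteration (assumed), from the slide
UNIT_GFLOPS = 30    # compute rate per unit
UNIT_GBPS = 30      # memory bandwidth per unit


def iterations_per_second(units):
    compute_s = TOTAL_GFLOP / (units * UNIT_GFLOPS)
    memory_s = TOTAL_GB / (units * UNIT_GBPS)
    return 1.0 / (compute_s + memory_s)  # no overlap of compute and memory time


for units in (64, 128, 256, 512, 1024):
    print(units, round(iterations_per_second(units), 2))
```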

Cloud Ultrasound Processing Service: Signal reconstruction

Multiresolution in the solving algorithm
Coarse-to-fine propagation: capture the general structure at coarser scales and improve details at finer scales
Inherent spline inter-scale relation
Multigrid solving algorithm

[Diagram: multigrid scheme over Scale 1, Scale 2, ..., Scale N. Iterate at each scale, with projection to the coarser scale and projection to the finer scale between them, and a direct solve at the coarsest scale]

Few iterations are needed at each scale to get a reasonably good solution
With each coarser scale the cost of an iteration decreases exponentially
In total this requires much less computational load than pure iteration
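The exponential drop in cost per scale can be illustrated with a geometric sum. This is a rough sketch, assuming every one of the four axes is halved at each coarser scale and ignoring the cost of the projections and of the direct solve; the scale and iteration counts are the ones quoted later in the talk. It is not an attempt to reproduce the timing figures.

```python
DIMENSIONS = 4             # 3D + time
SCALES = 6                 # including the finest scale
ITERATIONS_PER_SCALE = 16


def coarse_to_fine_cost(scales, iterations, dims):
    """Total cost in units of one finest-scale iteration."""
    shrink = 2 ** dims  # assumed: every axis is halved per scale
    return sum(iterations / shrink ** s for s in range(scales))


cost = coarse_to_fine_cost(SCALES, ITERATIONS_PER_SCALE, DIMENSIONS)
print(round(cost, 3))  # 17.067
```

Under these assumptions the five coarser scales together add only about 7% to the cost of the 16 finest-scale iterations.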

Cloud Ultrasound Processing Service: Signal reconstruction

Total requirements per compute unit
Data size 512 x 512 x 512 x 64
6 scales including the finest scale
16 iterations at each scale

Motion reconstruction algorithms require ~9 times more computational load

Number of units   Memory, GB   CPU time, s
64                2.7          62
128               1.36         31.2
256               0.68         15.6
512               0.34         7.8
1024              0.17         3.9

Cloud Ultrasound Processing Service

Cost estimation, data size 512 x 512 x 512 x 64
Storage, GB: 32
Number of instances: 1024
Compute hours per instance: 0.0011
Instance-hours: 1.13

Switzerland (1/1000 of world population)
700 cardiologists / 7’000’000 population
1000 echocardiograms per year per cardiologist
Multiple views (3) per patient
Multiple analyses (3) per view

7’000’000 use cases/year
3043 use cases/hour

Fully loaded 3 x 1024 instances (uniform load in Switzerland)
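The load estimate can be retraced as follows. This is a sketch that takes the yearly use-case figure as given; the 2300 hours per year is an inferred assumption (it is the divisor that yields 3043 use cases per hour) and does not appear on the slide.

```python
USE_CASES_PER_YEAR = 7_000_000  # figure from the slide
HOURS_PER_YEAR = 2300           # assumption: inferred, not stated on the slide
HOURS_PER_INSTANCE = 0.0011     # compute hours per instance for one use case
INSTANCES = 1024

use_cases_per_hour = USE_CASES_PER_YEAR / HOURS_PER_YEAR
instance_hours_per_use_case = HOURS_PER_INSTANCE * INSTANCES
busy_instances = use_cases_per_hour * instance_hours_per_use_case

print(int(use_cases_per_hour))                # 3043
print(round(instance_hours_per_use_case, 2))  # 1.13
print(round(busy_instances / INSTANCES, 1))   # 3.3
```

The last value, a little over three groups of 1024 instances kept busy, is in line with the "3 x 1024 instances" figure.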

Collaborative Research enabled by the Cloud

Globally available service for processing, storing and accessing medical data
Standardized DICOM interface for unifying data access
Involve all interested parties around the world
World-wide large scale trials are possible
Getting more statistics for rare cases
Building reference datasets for known cases

Conclusion

An approach to rapid diagnosis of heart disease using cloud-based distributed computing

Replace the ultrasound machine’s supercomputer with a cloud service for remote processing and storage
Miniaturization of the medical equipment and a decrease in its cost
Availability of advanced analysis technologies for objective analysis
Availability at multiple points of care
Unified storage and access of medical data
Enables collaborative research