DESCRIPTION
Obtaining a good load balance is a significant challenge in scaling up lattice-Boltzmann simulations of realistic sparse problems to the exascale. Here we analyze the effect of weighted decomposition on the performance of the HemeLB lattice-Boltzmann simulation environment when applied to sparse domains. Prior to domain decomposition, we assign increased weights to wall and in/outlet sites, reflecting their higher computational cost. We combine this weighted decomposition with a second optimization: sorting the lattice sites along a space-filling curve. We tested these strategies on a sparse bifurcation geometry and a very sparse aneurysm geometry, and find that using weights reduces calculation load imbalance by up to 85%, although the overall communication overhead is higher in some of our runs.
Weighted Decomposition: Are some lattice sites more equal than others?
Derek Groen, David Abou Chacra, Jiri Jaros, Rupert Nash, Miguel Bernabeu, Peter Coveney
Overview
● Cerebrovascular bloodflow and HemeLB.
● What type of performance matters.
● Weighted decomposition.
● Tests & results.
● Discussion / Future work.
Cerebrovascular diseases
● Stroke is the main cause of about 1.1M deaths per year in Europe.
● ~15% are caused by bleeding in the brain.
● We aim to accurately model cerebral bloodflow, and to eventually provide assistance with cerebrovascular surgery.
Features in HemeLB
1. Generation of computational models from medical images.
2. A wide range of collision kernels and boundary conditions.
3. Sparse geometries with negligible overhead (using ParMETIS).
4. Streaming visualization and steering of the simulation.
5. Coupling to external models.
Decomposition
Domain decomposition
Current:
1. Read in blocks.
2. Pass the geometry to ParMETIS.
3. ParMETIS uses k-way partitioning.
Proposed changes:
● Vary the weights on lattice sites.
● Use a space-filling curve (z-ordering) prior to partitioning.
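To illustrate the z-ordering step, here is a minimal Python sketch that sorts lattice sites by their Morton (z-order) key, obtained by interleaving the bits of the integer coordinates. The function names `morton_key` and `z_order`, the 21-bit coordinate limit, and the choice of x as the fastest-varying axis are assumptions made for illustration; this is not HemeLB's actual implementation.

```python
def morton_key(x: int, y: int, z: int, bits: int = 21) -> int:
    """Interleave the lowest `bits` bits of x, y and z into one integer.

    Bit i of x goes to position 3*i, of y to 3*i + 1, of z to 3*i + 2.
    Coordinates are assumed to be non-negative lattice indices.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def z_order(sites):
    """Return the (x, y, z) sites sorted along the z-order curve."""
    return sorted(sites, key=lambda s: morton_key(*s))


if __name__ == "__main__":
    # Fluid sites of a tiny, hypothetical sparse geometry.
    fluid_sites = [(3, 0, 0), (0, 0, 1), (1, 1, 0), (0, 0, 0), (1, 0, 0)]
    for site in z_order(fluid_sites):
        print(site, morton_key(*site))
```

Sites that are adjacent in the sorted list tend to be close in space, which is the property one would hope a partitioner can exploit when it receives the sites in this order.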
Obtaining weights
Weights per site type:

Site type          Intel SandyBridge   AMD Bulldozer
Bulk               10.0                10.0
Wall               18.708              20.226
In/outlet          40.037              37.398
Wall + in/outlet   (22.700)            (34.577)

● We ran 5 test simulations with cylinders of different aspect ratios to obtain an estimate of the calculation cost of each site type.
● ParMETIS tends to perform worse when weight values are large.
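As a rough illustration of how such costs might be used, the Python sketch below scales the Intel SandyBridge costs from the table above down to small integer weights and then measures the weighted load imbalance of a given site-to-rank assignment. The helper names, the rounding scheme, and the imbalance definition (maximum load divided by mean load, minus one) are assumptions made for this example and are not taken from HemeLB or the paper.

```python
# Relative per-site costs from the table above (Intel SandyBridge column).
COST = {"bulk": 10.0, "wall": 18.708, "inlet_outlet": 40.037}


def integer_weights(cost, bulk_weight=1):
    """Scale costs so that bulk sites get `bulk_weight`, then round.

    Hypothetical scheme: keeps the weight values small, since large
    values are reported to hurt partitioning quality.
    """
    scale = bulk_weight / cost["bulk"]
    return {kind: max(1, round(value * scale)) for kind, value in cost.items()}


def load_imbalance(site_types, owner, weights, n_ranks):
    """Return max(load) / mean(load) - 1 over all ranks.

    site_types[i] is the type of site i, owner[i] the rank that holds it.
    """
    loads = [0.0] * n_ranks
    for kind, rank in zip(site_types, owner):
        loads[rank] += weights[kind]
    mean = sum(loads) / n_ranks
    return max(loads) / mean - 1.0


if __name__ == "__main__":
    weights = integer_weights(COST)
    # Both ranks hold four sites, but rank 1 holds the two wall sites.
    site_types = ["bulk"] * 6 + ["wall"] * 2
    owner = [0, 0, 0, 0, 1, 1, 1, 1]
    print(weights)
    print(load_imbalance(site_types, owner, weights, n_ranks=2))
```

In this toy case the two ranks hold the same number of sites, so an unweighted count reports perfect balance, while the weighted measure shows that the rank holding the wall sites has more work.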
[Figure: lattice sites labelled with the weight values 4, 8 and 16.]
Test setup
● D3Q19, BFL boundaries.
● Bifurcation geometry: 650,492 lattice sites, 10% fluid; run on HECToR XE6.
● Aneurysm geometry: 5,667,778 lattice sites, 1.5% fluid; run on HECToR XT6 and ARCHER XC30.
[Figures: result plots for HECToR and ARCHER.]
Observations/Future work
● Some lattice sites are more equal than others.
● Getting a load-balanced decomposition is a major challenge.
Full Paper
A full paper will be available in the proceedings of EASC 2014.
If you would like a preprint version, please contact me at [email protected]
Thank you!
UKCOMES: UK Consortium On Mesoscale Engineering and Science.