SIAM Conference on Parallel Processing for Scientific Computing, Tokyo, Japan

Sustainability and Efficiency for Simulation Software in the Exascale Era

Dominik Thönnes, Ulrich Rüde, Nils Kohl

Chair for System Simulation, University of Erlangen-Nürnberg

March 09, 2018

Joint work with Dominik Bartuschat (FAU); Daniel Drzisga, Markus Huber, Barbara Wohlmuth (TUM); Simon Bauer, Marcus Mohr, Hans-Peter Bunge (LMU)



Dominik Thönnes | Chair for System Simulation | Sustainability and Efficiency for Simulation Software in the Exascale Era

TERRA NEO

TERRA

09.03.2018 2

TerraNeo Project

• Motivated by simulating Earth mantle convection

• Triangle/tetrahedral meshes allow modeling of complex geometries

• Structured refinement enables the use of matrix-free methods

• Fully distributed data structures allow optimal scalability

• Support for different discretizations, e.g. first-order and higher-order finite elements, finite volumes


Abstraction Data - Topology

[Diagram: Macro Primitives. Data (intra-primitive building blocks): (Simulation) Data, Calculations. Topology (inter-primitive building blocks): Communication, Load Balancing, Neighborhood, Serialization (Buffer / File).]


From mesh to primitives (2D)

[Diagram: Input Mesh (Mesh File, Library, Generate Mesh), Setup Domain, Create Primitives (vertices, edges, faces), Load balancing, Distribution, Fully Distributed Domain (Rank 0, Rank 1; local primitives, neighborhood).]


Load Balancing 2D (Faces)

• Round-Robin

• ParMETIS

• Greedy
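A minimal sketch of the simplest of these strategies, round-robin, assuming the primitives are identified by consecutive integer IDs. The function name `assignRoundRobin` is hypothetical and not taken from the TerraNeo code; ParMETIS and the greedy strategy are not shown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: primitive i is assigned to rank (i mod numRanks).
// Returns one target rank per primitive.
std::vector<int> assignRoundRobin(std::size_t numPrimitives, int numRanks)
{
    std::vector<int> targetRank(numPrimitives);
    for (std::size_t i = 0; i < numPrimitives; ++i)
        targetRank[i] = static_cast<int>(i % static_cast<std::size_t>(numRanks));
    return targetRank;
}
```

Round-robin balances the number of primitives per rank but ignores the mesh connectivity, so neighboring primitives may end up on different ranks.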


Load Balancing 2D (Edges)

• Round-Robin

• ParMETIS


Load Balancing 3D (Tetrahedra)

• Round-Robin

• ParMETIS

• Greedy


Data Handling

Primitive

Macro Primitive Types: Vertex, Edge, Face

Lightweight metadata:

● globally unique ID

● direct neighborhood (IDs)

● geometric information (e.g. vertex coordinates, orientation, …)

Registered / Allocated Data (actual simulation data, arbitrary data structures):

● P1 FE

● P2 FE

● Flag Field
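The split between lightweight metadata and registered data could look roughly like the following sketch. All names (`Primitive`, `registerData`, `getData`) are hypothetical and chosen for illustration only; the type-erased storage via `std::shared_ptr<void>` is an assumption, not the project's actual mechanism.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class PrimitiveType { Vertex, Edge, Face };

// Hypothetical macro primitive: lightweight metadata plus registered data.
struct Primitive
{
    std::uint64_t id = 0;                              // globally unique ID
    PrimitiveType type = PrimitiveType::Vertex;
    std::vector<std::uint64_t> neighborIDs;            // direct neighborhood (IDs)
    std::vector<std::array<double, 2>> coordinates;    // geometric information (2D)

    // Register an arbitrary data structure under a name (e.g. a P1 field).
    template <typename T>
    void registerData(const std::string& name, T value)
    {
        data[name] = std::make_shared<T>(std::move(value));
    }

    // Access registered data. The caller must request the type that was
    // registered; std::map::at throws if the name is unknown.
    template <typename T>
    T& getData(const std::string& name)
    {
        return *static_cast<T*>(data.at(name).get());
    }

    std::map<std::string, std::shared_ptr<void>> data; // actual simulation data
};
```

Keeping the metadata small means every rank can hold the metadata of its neighbors cheaply, while the large simulation fields exist only where they were registered.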


Communication

3-layer abstraction

Control Layer: coordinates buffers and directions

Packing Layer: interface for packing / unpacking data to / from buffers

Buffer Layer: MPI abstraction

Send / Recv Buffer

int a = 42;

bufferSystem.sendBuffer( rank0 ) << a;

bufferSystem.sendAll();
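The stream-style packing used in the snippet above can be illustrated with a small in-memory sketch. It assumes trivially copyable values and performs no MPI communication; the classes `ToySendBuffer` and `ToyRecvBuffer` are hypothetical stand-ins, not the buffer system used in the talk.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical send buffer: appends the raw bytes of each streamed value.
class ToySendBuffer
{
public:
    template <typename T>
    ToySendBuffer& operator<<(const T& value)
    {
        static_assert(std::is_trivially_copyable<T>::value, "raw byte copy only");
        const unsigned char* first = reinterpret_cast<const unsigned char*>(&value);
        bytes_.insert(bytes_.end(), first, first + sizeof(T));
        return *this;
    }

    const std::vector<unsigned char>& bytes() const { return bytes_; }

private:
    std::vector<unsigned char> bytes_;
};

// Hypothetical receive buffer: reads values back in the order they were packed.
class ToyRecvBuffer
{
public:
    explicit ToyRecvBuffer(std::vector<unsigned char> bytes) : bytes_(std::move(bytes)) {}

    template <typename T>
    ToyRecvBuffer& operator>>(T& value)
    {
        static_assert(std::is_trivially_copyable<T>::value, "raw byte copy only");
        if (pos_ + sizeof(T) > bytes_.size())
            throw std::out_of_range("ToyRecvBuffer: read past end of buffer");
        std::memcpy(&value, bytes_.data() + pos_, sizeof(T));
        pos_ += sizeof(T);
        return *this;
    }

private:
    std::vector<unsigned char> bytes_;
    std::size_t pos_ = 0;
};
```

In a real buffer layer the byte vector would be handed to MPI; here a `ToyRecvBuffer` constructed from `send.bytes()` plays the role of the receiving rank, so `send << a; recv >> b;` round-trips a value.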


Communication

[Diagram across Rank 0, Rank 1 and Rank 2: Face Data and Edge Data are exchanged through BufferSystem SendBuffers and RecvBuffers. Labels: pack (parallel), non-blocking MPI, unpack, direct copy.]


Data access abstraction

Abstract index (x,y)          Actual memory index

0,4                           14
0,3 1,3                       12 13
0,2 1,2 2,2                   9 10 11
0,1 1,1 2,1 3,1               5 6 7 8
0,0 1,0 2,0 3,0 4,0           0 1 2 3 4

face_index(level, x, y) => linearized index

face_index(2, 3, 1) => 8
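The mapping from the abstract (x, y) index to the linear memory index can be sketched as follows. The sketch assumes that a level-l face has 2^l + 1 points along each edge and that rows are stored bottom-up, which reproduces the layout and the example value shown on the slide; the helper `rowSize` is a hypothetical name.

```cpp
#include <cstddef>

// Number of points along one edge of a macro face at the given level.
constexpr std::size_t rowSize(std::size_t level)
{
    return (std::size_t(1) << level) + 1;
}

// Row y holds rowSize(level) - y entries, so the rows below y contribute
// y * rowSize(level) - y * (y - 1) / 2 entries in total, which equals
// y * (rowSize(level) + 1) - y * (y + 1) / 2.
constexpr std::size_t face_index(std::size_t level, std::size_t x, std::size_t y)
{
    return y * (rowSize(level) + 1) - y * (y + 1) / 2 + x;
}
```

For level 2, row y = 1 starts at index 5, so `face_index(2, 3, 1)` gives 5 + 3 = 8.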


Data access abstraction

For stencil codes: indexing that is capable of iterating over all neighbors

for (stencilDirection neighbor : allNeighbors)

{

tmp += face_vertex_stencil[neighbor] * src[index(level ,i ,j ,neighbor)];

}

[Figure: stencil directions around the center VERTEX_C: VERTEX_W, VERTEX_E, VERTEX_N, VERTEX_S, VERTEX_NW, VERTEX_SE]

allNeighbors = {VERTEX_S, VERTEX_SE, VERTEX_W, VERTEX_E, VERTEX_NW, VERTEX_N}

allNeighborsWithCenter = {VERTEX_C, VERTEX_S, VERTEX_SE, VERTEX_W, VERTEX_E, VERTEX_NW, VERTEX_N}
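A self-contained sketch of such a direction-aware index function and the resulting stencil loop. The (dx, dy) offsets per direction are an assumption derived from the direction names, the function is called `stencilIndex` here (playing the role of `index(level, i, j, neighbor)` in the loop above), and `applyStencil` is a hypothetical wrapper. No bounds checks are done, so the point must lie in the interior of the face.

```cpp
#include <array>
#include <cstddef>
#include <vector>

enum stencilDirection { VERTEX_C, VERTEX_S, VERTEX_SE, VERTEX_W, VERTEX_E, VERTEX_NW, VERTEX_N };

const std::array<stencilDirection, 6> allNeighbors =
    {{VERTEX_S, VERTEX_SE, VERTEX_W, VERTEX_E, VERTEX_NW, VERTEX_N}};

// Linear index of the neighbor of (x, y) in the given direction.
std::size_t stencilIndex(int level, int x, int y, stencilDirection dir)
{
    // Assumed offsets, listed in enum order: C, S, SE, W, E, NW, N.
    static const int dx[7] = {0, 0, 1, -1, 1, -1, 0};
    static const int dy[7] = {0, -1, -1, 0, 0, 1, 1};
    const int row = (1 << level) + 1;   // points along one edge of the face
    const int nx = x + dx[dir];
    const int ny = y + dy[dir];
    // Same bottom-up triangular layout as the linearized face index.
    return static_cast<std::size_t>(ny * (row + 1) - ny * (ny + 1) / 2 + nx);
}

// Apply a 7-point stencil at the interior point (i, j).
double applyStencil(const std::array<double, 7>& face_vertex_stencil,
                    const std::vector<double>& src, int level, int i, int j)
{
    double tmp = face_vertex_stencil[VERTEX_C] * src[stencilIndex(level, i, j, VERTEX_C)];
    for (stencilDirection neighbor : allNeighbors)
    {
        tmp += face_vertex_stencil[neighbor] * src[stencilIndex(level, i, j, neighbor)];
    }
    return tmp;
}
```

Because the kernel only names directions, the memory layout can change without touching the stencil loop.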


Data on Interfaces

Distribution of unknowns onto the macro primitives

[Figure: Face 1, Edge 1, Face 2]

Black points mark ownership. White points are ghost points.

Orange points correspond to the same DoF.


Splitting of unknowns

[Figure: index layouts on a macro face. Vertex DoF are indexed by (x,y); Edge DoF by (x,y) plus an edge type ho, ve or di (horizontal, vertical, diagonal); Cell DoF by (x,y) plus a cell type gr or bl.]


Splitting of unknowns

[Figure: Vertex DoF, Edge DoF and Cell DoF index layouts as on the previous slide, annotated with P2 FE, P3 FE and 2x.]


Other interesting talks

• MS102, MS113 Large-Scale Simulation in Geodynamics

  • TerraNeo - A Finite Element Multigrid Framework for Extreme-Scale Earth Mantle Convection Simulations - Dominik Bartuschat

  • A Stencil Scaling Approach for Accelerating Matrix-Free Finite Element Implementations - Daniel Drzisga

• MS107 Highly Scalable Solvers for Computational PDEs

  • Matrix-Free Parallel Multigrid for Fe Systems with a Trillion Unknowns - Markus Huber


Annulus convection

• P1-P1 PSPG Elements for Stokes

• Finite volumes for heat transport

• 256 macro triangles

• 2145 DoFs on each macro (level 6)
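The figure of 2145 is consistent with counting the vertex points of one refined macro triangle: with n = 2^level intervals per edge there are (n + 1)(n + 2) / 2 points, and for level 6 this gives 65 · 66 / 2 = 2145. A small sketch of that count, assuming this is how the number was obtained; the function name is hypothetical.

```cpp
#include <cstddef>

// Number of vertex points on one macro triangle at the given refinement
// level, including the points on its boundary edges and corners.
constexpr std::size_t vertexDoFsPerMacroFace(std::size_t level)
{
    return ((std::size_t(1) << level) + 1) * ((std::size_t(1) << level) + 2) / 2;
}
```

Multiplying this count by the 256 macro triangles would overestimate the global number of unknowns, since points on shared macro edges and vertices are counted once per adjacent triangle.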