A Fast Multifrontal Solver for Non-Linear Multi-Physics Problems

A. Bertoldo, M. Bianco, G. Pucci
{cyberto, bianco1, geppo}@dei.unipd.it

ACG - ADVANCED COMPUTING GROUP

DEI - UNIVERSITY OF PADOVA

The multifrontal approach

Unifrontal: sequential assembly strategy

Multifrontal: recursive assembly strategy in a bottom-up fashion

June 6-9, 2004, Krakow, Poland

Regions

Mesh region E

An elementary region

A composite region

At each node:
1) Assemble the region
2) Eliminate fully-summed variables

The assembly tree

Leaves: The assembly phase gets elements from the FEM formulation

Root: at the root we have fully decomposed the linear system into LU factors.

II. Elimination phase: eliminate FS (fully-summed) variables using a blocked UL decomposition (BLAS Level 3).

III. Strip phase: strip factors away for subsequent use in the final backward and forward substitution.
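The elimination phase can be written as a block factorization. As a sketch, assume the assembled region F is partitioned so that the fully-summed variables occupy the trailing (bottom-right) block A_22. The symbols F, A_ij, U_ij, L_ij, and S are notation introduced here for illustration and do not come from the poster.

```latex
\begin{align*}
F &= \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}
   = \begin{pmatrix} I & U_{12} \\ 0 & U_{22} \end{pmatrix}
     \begin{pmatrix} S & 0 \\ L_{21} & L_{22} \end{pmatrix}, \\
A_{22} &= U_{22} L_{22}, \qquad
U_{12} = A_{12} L_{22}^{-1}, \qquad
L_{21} = U_{22}^{-1} A_{21}, \\
S &= A_{11} - U_{12} L_{21} = A_{11} - A_{12} A_{22}^{-1} A_{21}.
\end{align*}
```

In this form the two triangular solves and the update that produces S are matrix-matrix operations, which is where Level 3 BLAS kernels (TRSM-type and GEMM-type routines) can be applied. Under these assumptions, S plays the role of the reduced region handed to the parent node, while U_12, U_22, L_21, and L_22 are the factors set aside for the substitution steps.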

[Figure: Execution time. Execution times (s) versus number of elements (100, 400, 625, 2500) for PMMS, Unifrontal, MUMPS, and SuperLU.]

[Figure: Flops. Flops rate (MFLOPS) versus number of elements (100, 400, 625, 2500) for PMMS, Unifrontal, MUMPS, and SuperLU.]

PMMS is faster than the Unifrontal solver, although they share the same solving kernel.

The multifrontal assembly scheme is more efficient.

PMMS is faster than both SuperLU and MUMPS for all significant problem sizes.

The super-assembly phase and the use of BLAS boost the computation.

The larger the problem size, the faster PMMS is relative to the other solvers.

For larger test cases we expect a bigger performance improvement.

MUMPS and the Unifrontal solver exhibit higher flop rates than PMMS.

They are better tuned, but their algorithms have higher complexity.
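The last two observations are compatible with PMMS being the fastest solver. As a short worked relation, let W denote the number of floating-point operations a solver performs and R its sustained flop rate. Both symbols are introduced here for illustration and are not taken from the poster.

```latex
\begin{align*}
T &= \frac{W}{R}, \\
T_{\mathrm{PMMS}} < T_{\mathrm{other}}
  &\iff \frac{W_{\mathrm{PMMS}}}{W_{\mathrm{other}}} < \frac{R_{\mathrm{PMMS}}}{R_{\mathrm{other}}}.
\end{align*}
```

A solver with a lower flop rate therefore still finishes first whenever its operation count is smaller by a larger factor than its rate is.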

Performance Results

Algorithm of super-assembly phase

MIUR Center for Science and Application of Advanced Computing Paradigms at the University of Padova, Italy

Consorzio Roma Ricerche, Roma, Italy

International Centre for Mechanical Sciences, Udine, Italy

Ente per le Nuove tecnologie, l’Energia e l’Ambiente, Roma, Italy

Department of Information Engineering, University of Padova, Italy

- IBM Power3 @ 375 MHz with 4 GB of memory
- HPM Toolkit for performance measurements
- FE square meshes with 100, 400, 625, and 2500 square 8-node elements

- PMMS is our multifrontal solver
- SuperLU version 3.0
- MUMPS version 4.3

Ia. Assembly phase: merge the two reduced components into a new composite region.

Ib. Swap phase: pack FS rows and columns at the bottom-right corner of the non-reduced region.

Ic. Copy phase: copy FS blocks into temporary buffers.

Ia + Ib + Ic = Super-assembly phase
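The three sub-phases can be sketched on small dense matrices. The following is a minimal illustration in plain Python, not the PMMS implementation: the function name `super_assemble`, the list-of-lists storage, and the exact contents of the two buffers are assumptions made here.

```python
def super_assemble(vars_a, mat_a, vars_b, mat_b, fs):
    """Merge two reduced regions and isolate the fully-summed (FS) part.

    vars_a, vars_b: variable ids labelling the rows/columns of mat_a, mat_b.
    fs: set of variables that become fully summed in the composite region.
    """
    # Ia. Assembly: sum both components into one composite matrix.
    merged = list(vars_a) + [v for v in vars_b if v not in vars_a]
    pos = {v: i for i, v in enumerate(merged)}
    n = len(merged)
    front = [[0.0] * n for _ in range(n)]
    for vs, m in ((vars_a, mat_a), (vars_b, mat_b)):
        for i, vi in enumerate(vs):
            for j, vj in enumerate(vs):
                front[pos[vi]][pos[vj]] += m[i][j]

    # Ib. Swap: symmetric permutation that packs FS rows and columns
    # at the bottom-right corner.
    order = [v for v in merged if v not in fs] + [v for v in merged if v in fs]
    perm = [pos[v] for v in order]
    front = [[front[r][c] for c in perm] for r in perm]

    # Ic. Copy: move the FS blocks into separate buffers (assumed layout).
    k = sum(1 for v in merged if v not in fs)   # number of non-FS variables
    fs_rows = [row[:] for row in front[k:]]     # FS rows, incl. diagonal block
    fs_cols = [row[k:] for row in front[:k]]    # FS columns above that block
    return order, front, fs_rows, fs_cols


# Two hypothetical reduced regions sharing variable 2, which becomes FS.
K = [[2.0, -1.0], [-1.0, 2.0]]
order, front, fs_rows, fs_cols = super_assemble([1, 2], K, [2, 3], K, {2})
print(order)
print(front)
```

In this example variable 2 is shared by both components, so its diagonal entry receives two contributions before it is moved to the last row and column.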

Computation properties

Symbolic analysis

Symbolic data: these data are computed once at the beginning of the computation, but used at each iteration to perform the super-assembly phase.
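The split between a one-time symbolic step and a repeated numeric step can be sketched as follows. This is an illustrative assumption about what the symbolic data might contain (a variable ordering plus index maps). The names `symbolic_analysis` and `numeric_super_assembly` are hypothetical.

```python
def symbolic_analysis(vars_a, vars_b, fs):
    """Run once: fix the layout of the composite region (FS variables last)
    and record where each component entry must be added."""
    merged = list(vars_a) + [v for v in vars_b if v not in vars_a]
    order = [v for v in merged if v not in fs] + [v for v in merged if v in fs]
    pos = {v: i for i, v in enumerate(order)}
    return {
        "order": order,
        "map_a": [pos[v] for v in vars_a],
        "map_b": [pos[v] for v in vars_b],
    }


def numeric_super_assembly(sym, mat_a, mat_b):
    """Run at every iteration: only additions driven by the stored maps."""
    n = len(sym["order"])
    front = [[0.0] * n for _ in range(n)]
    for idx, m in ((sym["map_a"], mat_a), (sym["map_b"], mat_b)):
        for i, r in enumerate(idx):
            for j, c in enumerate(idx):
                front[r][c] += m[i][j]
    return front


# The mesh connectivity does not change, so the maps are computed once...
sym = symbolic_analysis([1, 2], [2, 3], {2})

# ...while the numeric values change at each non-linear iteration
# (the scaling below is an arbitrary stand-in for re-linearization).
fronts = []
for it in range(3):
    scale = 1.0 + it
    K = [[2.0 * scale, -1.0 * scale], [-1.0 * scale, 2.0 * scale]]
    fronts.append(numeric_super_assembly(sym, K, K))
print(sym["order"])
print(fronts[0])
```

Because the ordering already places FS variables last, the numeric step needs no searching or sorting. All index decisions were made in the symbolic step.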

Assembly tree topology

Finite element mesh data

Simulation of porous media under high temperature

Area of interest

Non-linear coupled multi-physics problem

Large non-linear systems of PDEs

LARGE LINEAR SYSTEMS

FEM

Physical model

Mathematical model

Linearization

Our Goal

Main features of our solver

Other specific features:
- Multifrontal assembly/elimination strategy
- Implicit minimum degree pivoting
- Symbolic preprocessing phase
- Super-assembly phase
- Blocked LU decomposition for elimination

- Frontal solvers are direct methods: they first transform the system using Gaussian elimination or LU decomposition, then obtain the final solution using forward and backward substitution.
- They do not operate on the completely assembled linear system, but rather interleave assembly phases with elimination phases.
- They require little memory and can exploit efficient dense linear algebra kernels.
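The interleaving of assembly and elimination can be seen on a toy problem. The sketch below is a unifrontal-style solve for a hypothetical 1D chain of two-node elements, written for illustration only. The function name `frontal_solve_chain` and the element data are assumptions, and no pivoting is performed.

```python
def frontal_solve_chain(elements):
    """Solve a 1D chain without ever forming the global matrix.

    elements: list of (Ke, fe), where Ke is a 2x2 element matrix and fe a
    length-2 load vector; element e couples nodes e and e + 1.
    """
    n = len(elements) + 1
    front_d, front_f = 0.0, 0.0   # partial sums on the node shared with the next element
    rows = []                     # stored pivot rows for the back substitution

    for Ke, fe in elements:
        # Assembly: add the element contribution to the front.
        a = front_d + Ke[0][0]
        b, c, d = Ke[0][1], Ke[1][0], Ke[1][1]
        f0 = front_f + fe[0]
        f1 = fe[1]
        # Elimination: the left node is now fully summed, so eliminate it.
        rows.append((a, b, f0))
        front_d = d - c * b / a
        front_f = f1 - c * f0 / a

    # The last node is fully summed after the final element.
    x = [0.0] * n
    x[-1] = front_f / front_d
    # Back substitution using the stored rows.
    for i in range(n - 2, -1, -1):
        a, b, f0 = rows[i]
        x[i] = (f0 - b * x[i + 1]) / a
    return x


Ke = [[2.0, -1.0], [-1.0, 2.0]]
fe = [1.0, 1.0]
x = frontal_solve_chain([(Ke, fe)] * 4)
print(x)
```

At any time the front holds a single active variable, which is why memory use stays small. In 2D or 3D meshes the front becomes a dense block, and that is where dense linear algebra kernels come into play.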

Regions can be both elementary and composite. The former are obtained from the finite element formulation. The latter are unions of two component regions from an assembly phase.
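The relation between regions, the assembly tree, and fully-summed variables can be sketched symbolically. The example below is a minimal illustration on a hypothetical four-element 1D mesh. The `Region` class, the helper names, and the tree shape are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    elements: frozenset   # finite elements covered by the region
    variables: frozenset  # variables still active (not yet eliminated)


def reduce_region(region, elems_of_var):
    """Eliminate the fully-summed variables: those whose contributing
    elements all lie inside the region."""
    fs = {v for v in region.variables if elems_of_var[v] <= region.elements}
    return Region(region.elements, region.variables - fs), fs


def process(node, elem_vars, elems_of_var, log):
    """Visit the assembly tree bottom-up. A node is either an element id
    (elementary region) or a pair (left, right) of subtrees (composite)."""
    if isinstance(node, int):
        region = Region(frozenset([node]), frozenset(elem_vars[node]))
    else:
        a = process(node[0], elem_vars, elems_of_var, log)
        b = process(node[1], elem_vars, elems_of_var, log)
        # Composite region: union of the two reduced components.
        region = Region(a.elements | b.elements, a.variables | b.variables)
    reduced, fs = reduce_region(region, elems_of_var)
    log.append(sorted(fs))
    return reduced


# Hypothetical 1D mesh: element e couples variables e and e + 1.
elem_vars = {e: (e, e + 1) for e in range(4)}
elems_of_var = {}
for e, vs in elem_vars.items():
    for v in vs:
        elems_of_var.setdefault(v, set()).add(e)

log = []
root = process(((0, 1), (2, 3)), elem_vars, elems_of_var, log)
print(log)             # fully-summed variables eliminated at each visited node
print(root.variables)  # active variables remaining at the root
```

Each entry of `log` lists the variables eliminated at one tree node, in the order the nodes are completed. No variable remains active once the root has been processed.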

The “spalling” phenomenon in a concrete pillar after a simulated fire.

The FE simulation of the “spalling” phenomenon in (a section of) a concrete pillar in case of fire.
