
Page 1: Open Problems of Petascale Molecular-Dynamics Simulations

Open Problems of Petascale Molecular-Dynamics Simulations

D. Grancharov, E. Lilkova, N. Ilieva, P. Petkov and L. Litov

University of Sofia “St. Kl. Ohridski”, Faculty of Physics
Institute for Nuclear Research and Nuclear Energy – BAS

Supercomputing Applications in Science and Industry
Sept. 20–21, Sunny Beach, Bulgaria

Page 2: Open Problems of Petascale Molecular-Dynamics Simulations


Content

1. Introduction
2. Molecular dynamics in brief
3. ODE integrators
4. Scalability of the MD packages GROMACS and NAMD in simulations of large systems
5. Workload distribution on the computing cores in MD simulations
6. pp:pme ratio optimization
7. Outlook


Page 3: Open Problems of Petascale Molecular-Dynamics Simulations


A method for investigating the time evolution of atomic and molecular systems:

- classical description of the system;
- empirical parametrisation of the interaction potential between atoms and molecules – the molecular force field;
- the force field is conservative, depends on atomic positions only, and is pair-additive (NB: cut-offs, boundary conditions).

The total potential energy is a sum of bonded and non-bonded terms:

$$V = V_s + V_a + V_t + V_v + V_e + \ldots$$

where $V_s$ is the bond-stretching term, $V_a$ the bond-angle term, $V_t$ the torsion term, $V_v$ the van der Waals interactions, and $V_e$ the Coulomb interaction.
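To make the pair-additive, position-only structure concrete, here is a minimal sketch of a non-bonded energy evaluation (not the deck's parametrisation: the Lorentz-Berthelot combination rules, the parameter names and the plain cutoff are illustrative assumptions):

```python
import numpy as np

def nonbonded_energy(pos, q, sigma, eps, r_cut=0.9):
    """Pair-additive non-bonded energy: Lennard-Jones (V_v) + Coulomb (V_e).

    pos   : (N, 3) positions [nm];   q   : (N,) charges [e]
    sigma : (N,) LJ diameters [nm];  eps : (N,) LJ well depths [kJ/mol]
    r_cut : plain cutoff radius [nm]; pairs beyond it are dropped
    """
    ke = 138.935458          # Coulomb constant [kJ mol^-1 nm e^-2]
    n, V = len(pos), 0.0
    for i in range(n):
        for j in range(i + 1, n):              # pair-additivity: loop over pairs
            r = np.linalg.norm(pos[i] - pos[j])
            if r > r_cut:                      # NB: cutoff, cf. the slide's remark
                continue
            s = 0.5 * (sigma[i] + sigma[j])    # Lorentz-Berthelot rules (assumed)
            e = np.sqrt(eps[i] * eps[j])
            sr6 = (s / r) ** 6
            V += 4.0 * e * (sr6 * sr6 - sr6)   # van der Waals term V_v
            V += ke * q[i] * q[j] / r          # Coulomb term V_e
    return V
```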

Page 4: Open Problems of Petascale Molecular-Dynamics Simulations


QM: Schrödinger equation; the probability to find the system at $(x, t)$ is
$$P(x,t) = \Psi^*(x,t)\,\Psi(x,t).$$

CM: Newton's equation
$$F = ma = m\,\frac{d^2x}{dt^2}, \qquad F = -\frac{\partial V}{\partial x}.$$

MD:
$$F = \frac{d(mv)}{dt}, \qquad v = \frac{dx}{dt}.$$

Page 5: Open Problems of Petascale Molecular-Dynamics Simulations


Hamiltonian nature of the investigated dynamics


Hamilton's equations:
$$\dot q = \partial H/\partial p, \qquad \dot p = -\,\partial H/\partial q.$$

Case of quadratic kinetic energy:
$$T(q,\dot q) = \tfrac{1}{2}\,\dot q^{\,T} M(q)\,\dot q, \qquad T(q,p) = \tfrac{1}{2}\,p^{T} M(q)^{-1} p, \qquad H(p,q) = T(p,q) + V(q).$$

If, in addition, $H(p,q) = T(p) + V(q)$, the Hamiltonian is separable.

The flow $\varphi_h : (q(t), p(t)) \mapsto (q(t+h), p(t+h))$ is a symplectic transformation (theorem, Poincaré, 1899):
$$\varphi_h'^{\,T} J\,\varphi_h' = J, \qquad J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}.$$

A one-step numerical method $(q_{n+1}, p_{n+1}) = \Phi_{h,H}(q_n, p_n)$ is symplectic if it defines such a transformation.
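As a worked check (added here, not on the original slide): for the symplectic Euler method applied to a separable Hamiltonian with one degree of freedom, the Jacobian of one step has unit determinant, which in one degree of freedom is exactly the condition above:

```latex
% Symplectic Euler for H(p,q) = T(p) + V(q), one degree of freedom:
%   p_{n+1} = p_n - h V'(q_n),   q_{n+1} = q_n + h T'(p_{n+1})
\[
A = \frac{\partial(q_{n+1},\,p_{n+1})}{\partial(q_n,\,p_n)}
  = \begin{pmatrix}
      1 - h^{2}\,T''(p_{n+1})\,V''(q_n) & h\,T''(p_{n+1}) \\
      -\,h\,V''(q_n) & 1
    \end{pmatrix},
\qquad
\det A = 1 .
\]
% In one degree of freedom, A^T J A = (det A) J = J, so each step
% preserves the symplectic form (the oriented area) exactly.
```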

Page 6: Open Problems of Petascale Molecular-Dynamics Simulations


Symplectic, i.e.:

- preserving the symplectic form $\sum_i dq_i \wedge dp_i$;
- preserving oriented areas in phase space.

Most of the usual numerical methods (primitive Euler, classical Runge-Kutta) are not symplectic integrators.

Encke/Störmer/leap-frog/Verlet: $(q_n, p_n) \to (q_{n+1/2}, p_{n+1/2}) \to (q_{n+1}, p_{n+1})$,
$$p_{n+1/2} = p_n + \tfrac{h}{2}\,F(q_n), \qquad q_{n+1} = q_n + h\,M^{-1} p_{n+1/2}, \qquad p_{n+1} = p_{n+1/2} + \tfrac{h}{2}\,F(q_{n+1}).$$

Extensions: fixed step size → variable step size; single-step → multiple-step, multirate.

Ex.: step-size restrictions:
- resonances;
- the most rapid vibrational mode;
- implicit algorithms: nonlinear, and still resonance-limited.
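A small numerical illustration of this point (a sketch for a unit-mass harmonic oscillator, added here for concreteness): the primitive Euler method gains energy without bound, while the Verlet scheme keeps the energy error bounded.

```python
def euler_step(q, p, h, force, m=1.0):
    """Primitive (non-symplectic) Euler step."""
    return q + h * p / m, p + h * force(q)

def verlet_step(q, p, h, force, m=1.0):
    """One Stoermer-Verlet step: kick - drift - kick."""
    p_half = p + 0.5 * h * force(q)
    q_new = q + h * p_half / m
    p_new = p_half + 0.5 * h * force(q_new)
    return q_new, p_new

# Harmonic oscillator: V(q) = q^2/2, F(q) = -q; the exact energy is constant.
force = lambda q: -q
energy = lambda q, p: 0.5 * p**2 + 0.5 * q**2

h, n_steps = 0.05, 20000
qe, pe = 1.0, 0.0          # Euler trajectory
qv, pv = 1.0, 0.0          # Verlet trajectory
for _ in range(n_steps):
    qe, pe = euler_step(qe, pe, h, force)
    qv, pv = verlet_step(qv, pv, h, force)

print(f"Euler  energy drift: {energy(qe, pe) - 0.5:+.3e}")  # grows without bound
print(f"Verlet energy drift: {energy(qv, pv) - 0.5:+.3e}")  # stays small and bounded
```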

Page 7: Open Problems of Petascale Molecular-Dynamics Simulations


Composition methods: compose a basic one-step method $\Phi$ with fractional steps,
$$\Psi_h = \Phi_{\gamma_s h} \circ \cdots \circ \Phi_{\gamma_1 h}, \qquad \gamma_1 + \cdots + \gamma_s = 1.$$

Splitting methods: split the vector field,
$$\dot y = f(y) = f^{[1]}(y) + f^{[2]}(y).$$

Ex.: symplectic Euler and Störmer-Verlet schemes. For $H(p,q) = T(p) + V(q)$ the two partial systems are solved exactly:
$$\varphi^T_t:\ \dot q = \nabla T(p),\ \dot p = 0 \;\Rightarrow\; q(t) = q_0 + t\,\nabla T(p_0),\ p(t) = p_0;$$
$$\varphi^V_t:\ \dot q = 0,\ \dot p = -\nabla V(q) \;\Rightarrow\; q(t) = q_0,\ p(t) = p_0 - t\,\nabla V(q_0).$$

Their compositions recover both schemes:
$$\Phi^{\mathrm{sympl.\ Euler}}_h = \varphi^V_h \circ \varphi^T_h, \qquad \Phi^{\mathrm{Verlet}}_h = \varphi^V_{h/2} \circ \varphi^T_h \circ \varphi^V_{h/2}.$$
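The composition view can be checked directly in a few lines (a sketch assuming unit mass, T(p) = p²/2 and a caller-supplied gradient of V): composing the two exact flows reproduces the kick-drift-kick Verlet step.

```python
import numpy as np

# Exact flows of the two split vector fields of H(p,q) = T(p) + V(q),
# here with T(p) = p^2/2 (unit mass) and a user-supplied grad_V.
def flow_T(q, p, t):
    """phi^T_t: free drift, p constant."""
    return q + t * p, p

def flow_V(q, p, t, grad_V):
    """phi^V_t: momentum kick, q constant."""
    return q, p - t * grad_V(q)

def verlet_by_splitting(q, p, h, grad_V):
    """Stoermer-Verlet as the composition phi^V_{h/2} o phi^T_h o phi^V_{h/2}."""
    q, p = flow_V(q, p, 0.5 * h, grad_V)
    q, p = flow_T(q, p, h)
    q, p = flow_V(q, p, 0.5 * h, grad_V)
    return q, p

# Check against the standard kick-drift-kick form for V(q) = q^4/4.
grad_V = lambda q: q**3
q, p, h = 0.7, -0.2, 0.01
q1, p1 = verlet_by_splitting(q, p, h, grad_V)

p_half = p - 0.5 * h * grad_V(q)
q2 = q + h * p_half
p2 = p_half - 0.5 * h * grad_V(q2)
print(np.isclose(q1, q2), np.isclose(p1, p2))  # True True
```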

Page 8: Open Problems of Petascale Molecular-Dynamics Simulations


Combining exact and numerical flows:
$$\Phi_h = \varphi^{[1]}_h \circ \Phi^{[2]}_h, \qquad \Phi_h = \varphi^{[1]}_{h/2} \circ \Phi^{[2]}_h \circ \varphi^{[1]}_{h/2},$$
where $\varphi$ denotes an exact and $\Phi$ a numerical flow.

Splitting in more than two vector fields:
$$\dot y = f^{[1]}(y) + f^{[2]}(y) + \cdots + f^{[N]}(y), \qquad \Phi_h = \varphi^{[1]}_h \circ \varphi^{[2]}_h \circ \cdots \circ \varphi^{[N]}_h.$$

Integrators based on generating functions:
$$S(p,q) = h\,G_1(p,q) + h^2\,G_2(p,q) + \cdots + h^r\,G_r(p,q),$$
generating the map $(q_n, p_n) \mapsto (q_{n+1}, p_{n+1})$.

Variational integrators: discretizing the action integral.

Page 9: Open Problems of Petascale Molecular-Dynamics Simulations


Integration algorithms with variable time step:
- improved performance, but degradation of accuracy (trade-off vs. structure preservation);
- accurate trajectories → high-order methods, small time steps;
- high order does not by itself guarantee the structural properties (E, P) → deficiency in long-term performance;
- complicated, unstable, chaotic trajectories: preserve as much structure as possible;
- loss of symplecticness → simple variable / symmetrized time step;
- different step sizes in different phase-space regions (computational cost).

Multirate methods (processes → subsystems of the ODE system), e.g. 2-, 3- and 4-body interactions.

Page 10: Open Problems of Petascale Molecular-Dynamics Simulations


$\Gamma = (q, p)$ – a point in the phase space of the system. Hamilton's equations define the Liouville operator:
$$\dot\Gamma = iL\,\Gamma, \qquad iL = \{\,\cdot\,, H\},$$
in Cartesian coordinates
$$H = T + V = \sum_i \frac{p_i^2}{2m_i} + V(x).$$
The kinetic and potential parts of the Liouville operator do not commute.

Page 11: Open Problems of Petascale Molecular-Dynamics Simulations


One-step propagators with step $\Delta t$, applied to the phase-space point $\Gamma = (x, v)$, with $F(x) = -\,\partial U(x)/\partial x$:
$$x_{n+1} = x_n + v_n\,\Delta t + \frac{(\Delta t)^2}{2m}\,F(x_n),$$
$$v_{n+1} = v_n + \frac{\Delta t}{2m}\,\big[F(x_n) + F(x_{n+1})\big]$$
– the velocity Verlet scheme.

Page 12: Open Problems of Petascale Molecular-Dynamics Simulations


Multiple time-step schemes: split the force,
$$F = F_1 + F_2 + \cdots + F_M, \qquad \ddot r = M^{-1}F,$$
with $F_i$ "softer" (more slowly varying) than $F_{i+1}$.

With $D = \{\,\cdot\,, T\}$ and $K_i = \{\,\cdot\,, V_i\}$, the propagator is factorized level by level:
$$e^{\Delta t_0\{\,\cdot\,,H\}} \approx e^{\frac{\Delta t_0}{2}K_0}\; e^{\Delta t_0 (D + K_1 + K_2 + \cdots)}\; e^{\frac{\Delta t_0}{2}K_0},$$
$$e^{\Delta t_0 (D + K_1 + K_2 + \cdots)} \approx \Big[\, e^{\frac{\Delta t_1}{2}K_1}\; e^{\Delta t_1 (D + K_2 + K_3 + \cdots)}\; e^{\frac{\Delta t_1}{2}K_1} \Big]^{M_1}, \qquad \Delta t_0 = M_1\,\Delta t_1,$$
and so on for the inner levels.

The overall step is $\Delta t_0$, but effectively only down to the $i$-th level, if $K_{i+1} = K_{i+2} = \cdots = 0$.
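A minimal two-level sketch of such a scheme (the slow/fast force split, the toy forces and all parameter names are assumptions for illustration; GROMACS and NAMD implement their own variants):

```python
def respa_step(x, v, dt_outer, n_inner, f_slow, f_fast, m=1.0):
    """One two-level multiple-time-step (RESPA-type) step.

    The slow force gets half 'kicks' at the outer step dt_outer; the fast
    force is integrated with Verlet at the smaller step dt_outer / n_inner.
    """
    dt_inner = dt_outer / n_inner
    v = v + 0.5 * dt_outer * f_slow(x) / m      # outer half-kick (slow force)
    for _ in range(n_inner):                    # inner Verlet loop (fast force)
        v = v + 0.5 * dt_inner * f_fast(x) / m
        x = x + dt_inner * v
        v = v + 0.5 * dt_inner * f_fast(x) / m
    v = v + 0.5 * dt_outer * f_slow(x) / m      # closing outer half-kick
    return x, v

# Toy split: stiff bond-like force (fast) + weak soft force (slow).
f_fast = lambda x: -100.0 * x
f_slow = lambda x: -0.1 * x**3
x, v = 1.0, 0.0
for _ in range(1000):
    x, v = respa_step(x, v, dt_outer=0.05, n_inner=10,
                      f_slow=f_slow, f_fast=f_fast)
print(x, v)
```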

Page 13: Open Problems of Petascale Molecular-Dynamics Simulations


MD-simulation performance for large systems (10^5 atoms and more): scalability, distribution of the computational load, and its dependence on the functional assignment to the individual processors.

Example systems:
- epidermal growth factor: ~5 × 10^5 atoms;
- satellite of the tobacco mosaic virus: ~10^6 atoms;
- E. coli ribosome in water: ~2.2 × 10^6 atoms.

Page 14: Open Problems of Petascale Molecular-Dynamics Simulations


System size     Cores    NAMD CVS 2011-02-19       GROMACS 4.5.4
(atoms)                  [ns/day]   speed-up       [ns/day]   speed-up

465 399         8 192    12.21       6.68           2.57       2.74
                4 096    10.24       5.60           5.32       5.68
                2 048     6.08       3.33           3.08       3.29
                1 024     3.14       1.72           1.84       1.96
                  512     1.83       1.00           0.94       1.00

1 007 930       8 192    11.37      11.08             –          –
                4 096     7.03       6.85             –          –
                2 048     3.57       3.47             –          –
                1 024     1.97       1.92             –          –
                  512     1.03       1.00             –          –

2 233 537       8 192     5.86      12.53             –          –
                4 096     3.44       7.36             –          –
                2 048     1.80       3.84             –          –
                1 024     0.92       1.97             –          –
                  512     0.47       1.00             –          –

Notes:
- Both packages compiled with the IBM XL compilers for the architecture of the BlueGene/P computing nodes.
- NAMD: compressed input data.
- A peculiarity in the way data is loaded into the RAM of the IBM BlueGene/P computing cores limits GROMACS to about 700 000 atoms (hence no GROMACS data for the two larger systems).
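The speed-up columns translate directly into parallel efficiency; for example, from the NAMD rows of the 2 233 537-atom system:

```python
# Speed-up and parallel efficiency relative to the 512-core baseline,
# using the NAMD rows for the 2 233 537-atom system from the table above.
perf = {512: 0.47, 1024: 0.92, 2048: 1.80, 4096: 3.44, 8192: 5.86}  # ns/day
base_cores, base_perf = 512, perf[512]

for cores, p in sorted(perf.items()):
    speedup = p / base_perf
    efficiency = speedup / (cores / base_cores)   # 1.0 would be ideal scaling
    print(f"{cores:5d} cores: speed-up {speedup:5.2f}, efficiency {efficiency:4.2f}")
# 8192 cores give a ~12.5x speed-up on 16x the cores: ~78 % parallel efficiency.
```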

Page 15: Open Problems of Petascale Molecular-Dynamics Simulations


Performance: the simulated time (in ns) obtained in 24 hours of wall-clock time with an integration step of 2 fs.

Speed-up: relative to the performance at 512 computing cores, taken as the reference value.
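For concreteness, the conversion from a measured step rate to this metric (the steps-per-second value below is purely illustrative):

```python
# Performance in ns/day from an integration step of 2 fs and a measured
# number of MD steps per second (the throughput value is hypothetical).
dt_fs = 2.0                      # integration step [fs]
steps_per_second = 35.0          # hypothetical measured throughput
ns_per_day = steps_per_second * dt_fs * 1e-6 * 86400
print(f"{ns_per_day:.2f} ns/day")   # 35 steps/s at 2 fs -> ~6.05 ns/day
```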

Page 16: Open Problems of Petascale Molecular-Dynamics Simulations


Distribution of the system parts among the computing cores:
- particle decomposition (communication ~ N × N/2);
- domain decomposition;
- long-range interactions: PME algorithm.

SCALASCA profiling tool: guides the optimization of parallel programs by measuring and analyzing their behavior during the run:
1. instrumentation of the code;
2. running the instrumented code;
3. data analysis (Cube 3);
4. estimation of the efficiency, speed and parallelization behavior of the algorithms in use.
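A rough cost model contrasting the two decompositions (a sketch under simple assumptions: cubic domains, uniform density of about 100 atoms/nm³, a fixed cut-off; not a SCALASCA measurement). Note how, at high core counts, the halo a domain must import becomes comparable to or larger than the domain's own content, which is where communication starts to dominate:

```python
# Rough communication-volume model: particle vs. domain decomposition.
def particle_decomp_pairs(n_atoms):
    """Particle decomposition: every pair is potentially communicated, ~N*N/2."""
    return n_atoms * (n_atoms - 1) // 2

def domain_decomp_halo(n_atoms, n_domains, r_cut=0.9, density=100.0):
    """Domain decomposition: only atoms within r_cut of a face are exchanged.

    density in atoms/nm^3 (roughly right for water); returns halo atoms/domain.
    """
    volume = n_atoms / density                     # box volume [nm^3]
    domain_edge = (volume / n_domains) ** (1 / 3)  # edge of one cubic domain [nm]
    halo_volume = 6 * domain_edge**2 * r_cut       # skin of thickness r_cut
    return density * halo_volume

n = 465_399
print(f"particle decomposition: ~{particle_decomp_pairs(n):.3e} pairs")
print(f"domain decomposition (4096 domains): "
      f"~{domain_decomp_halo(n, 4096):.0f} halo atoms/domain")
```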

Page 17: Open Problems of Petascale Molecular-Dynamics Simulations


[Figure] Distribution of the communications: (a) interleave; (b) pp_pme; (c) cartesian (red – higher intensity, yellow – lower intensity).

Test setup: 103 079 atoms; 10 000 steps × 2 fs = 20 ps; periodic boundary conditions; Berendsen thermostat; no LINCS or P-LINCS constraints.

Page 18: Open Problems of Petascale Molecular-Dynamics Simulations


-ddorder mode performance:

Number of cores   interleave (default)   pp_pme      cartesian
                  [ns/day]               [ns/day]    [ns/day]
   512             6.672                  6.592       6.600
 1 024            12.122                 11.905      11.973
 2 048            20.856                 20.627      20.426
 4 096            27.994                 31.306      31.544

Up to 2 048 cores the three modes show similar performance; on 4 096 cores the default (interleave) mode is the slowest one.

Page 19: Open Problems of Petascale Molecular-Dynamics Simulations


Test system of ~465 000 atoms, 200 steps; 1/8 of all cores assigned as PME cores; runs on 512 and 1 024 cores (51 GB of output data on 1 024 cores).

512 cores:
- total time 3×10^6 s; execution time 2.3×10^6 s;
- time per core (average) 4 668 s (PME cores 5 888 s; PP cores 4 494 s);
- ~70 % in do_md, ~30 % in long-range electrostatics and domain decomposition on the cores;
- 3.18×10^6 communications in total: do_md ~58 %, long-range electrostatics ~21 %, initialization of the environment ~20 % (per-core counts: PME 10 453, PP 4 174).

Page 20: Open Problems of Petascale Molecular-Dynamics Simulations


1 024 cores:
- total time 3×10^6 s; execution time 2.3×10^6 s;
- time per core (average) 4 912 s (PME cores 6 829 s; PP cores 4 637 s);
- ~66.6 % in do_md, ~30 % in long-range electrostatics and domain decomposition on the cores;
- 7.1×10^6 communications in total: do_md ~60 %, long-range electrostatics ~22 %, initialization of the environment ~18 % (per-core counts: PME 13 600, PP 6 100).
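These two profiles quantify the scalability problem: doubling the cores more than doubles the total communication (a quick check of the ratios from the profiles above):

```python
# Communication growth between the 512- and 1024-core SCALASCA profiles above.
comm_512, comm_1024 = 3.18e6, 7.1e6     # total communication events
cores_512, cores_1024 = 512, 1024
growth = comm_1024 / comm_512           # ~2.23x the events for 2x the cores
print(f"communication grew {growth:.2f}x for {cores_1024 // cores_512}x the cores")
print(f"events per core: {comm_512 / cores_512:.0f} -> {comm_1024 / cores_1024:.0f}")
```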

Page 21: Open Problems of Petascale Molecular-Dynamics Simulations


Test system of ~200 000 atoms, 2 000 steps:
- pme:pp ratio varied from 1:1 to 1:3 (16 → 8 PME cores out of 32);
- tuning with g_tune_pme;
- cut-off radius varied from 0.9 nm to 1.15 nm.

Result: strong case-dependence of the most appropriate parameter set.

Page 22: Open Problems of Petascale Molecular-Dynamics Simulations


The increasing size and complexity of the investigated systems strongly press for a reconsideration of the existing algorithms, not only because of the exploding computational volume but also because of the poor scalability with the number of processors employed.

The investigations performed here clearly identify the main sources of the growth of inter-core communication, and thereby of the damping of the code's scalability.

The multiple-time-step symplectic integration algorithm we are working on aims at resolving this problem.


Page 24: Open Problems of Petascale Molecular-Dynamics Simulations


- T.F. Miller III, M. Eleftheriou, P. Pattnaik, A. Ndirango, D. Newns, and G.J. Martyna, J. Chem. Phys. 116 (2002) 8649.
- S. Nosé, J. Phys. Soc. Jpn. 70 (2001) 75.
- R. Skeel and J.J. Biesiadecki, Ann. Num. Math. 1 (1994) 1–9.
- D. Janezic and M. Praprotnik, J. Chem. Inf. Comput. Sci. 43 (2003) 1922–1927.
- M. Tao, H. Owhadi, and J.E. Marsden, Symplectic, linearly-implicit and stable integrators with applications to fast symplectic simulations of constrained dynamics, e-print arXiv:1103.4645 (2011).
- Wei He and Sanjay Govindjee, Application of a SEM Preserving Integrator to Molecular Dynamics, Rep. No. UCB/SEMM-2009/01, Jan 2009, Univ. of California, Berkeley, 27 pp.
- E. Hairer, C. Lubich, and G. Wanner, Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations (Springer, Heidelberg, 2nd ed., 2004).

Page 25: Open Problems of Petascale Molecular-Dynamics Simulations


- R.S. Herbst, Int. J. Radiat. Oncol. Biol. Phys. 59 (2 Suppl) (2004) 21–26.
- H. Zhang, A. Berezov, Q. Wang, G. Zhang, J. Drebin, R. Murali, and M.I. Greene, J. Clin. Invest. 117/8 (2007) 2051–2058.
- F. Walker, L. Abramowitz, D. Benabderrahmane, X. Duval, V.R. Descatoire, D. Hénin, T.R.S. Lehy, and T. Aparicio, Human Pathology 40/11 (2009) 1517–1527.
- http://www.ks.uiuc.edu/Research/STMV/
- E. Villa et al., Proc. Natl. Acad. Sci. USA 106 (2009) 1063–1068.
- K.Y. Sanbonmatsu and C.-S. Tung, Journal of Physics: Conference Series 46 (2006) 334–342.

Page 26: Open Problems of Petascale Molecular-Dynamics Simulations


Spare slides


Page 27: Open Problems of Petascale Molecular-Dynamics Simulations


Algorithms for solving the equation of motion:
- error at every step;
- accumulated error.