View
229
Download
3
Category
Tags:
Preview:
Citation preview
Welcome to Daresbury Laboratory
Alice’s Wonderland (1865)Lewis Carroll (Charles Lutwidge
Dodgson)
DL_SOFTWARE TUTORIALSoftware Solutions in Atomistic
ModellingILIAN TODOROV, CHIN YONGMICHAEL SEATON, JOHN PURTONDAVID GUNN, ANDREY BRUKHNO
WILLIAM SMITH
SCD, STFC DARESBURY LABORATORY, DARESBURYWARRINGTON WA4 4AD, CHESHIRE, ENGLAND, UK
Multiple Scales of Materials Modelling
MS&MD via DL_POLY
DPD & LB via DL_MESO
KMC via DL_AKMC
FF mapping
via DL_FIELD
MC
via
D
L_M
ON
TE
Coarse graining
via DL_CGMAP
Part 1Part 1
DL_POLY Project BackgroundDL_POLY Project Background
DL_POLY Trivia
• General purpose parallel (classical) MD simulation software • It was conceived to meet needs of CCP5 - The Computer Simulation of Condensed Phases (academic collaboration community)• Written in modularised Fortran90 (NagWare & FORCHECK compliant) with MPI2 (MPI1+MPI-I/O) & fully self-contained
• 1994 – 2010: DL_POLY_2 (RD) by W. Smith & T.R. Forester (funded for 6 years by EPSRC at DL). In 2010 moved to a BSD open source licence as DL_POLY_Classic.
• 2003 – 2010: DL_POLY_3 (DD) by I.T. Todorov & W. Smith (funded for 4 years by NERC at Cambridge). Up-licensed to DL_POLY_4 in 2010 – free of charge to academic researchers and at cost to industry (provided as source).
• ~ 15,500 licences taken out since 1994 (~1,600 annually)• ~ 2,600 e-mail list and ~1,300 FORUM members (since 2005)
Written in modularised free formatted F90 (+MPI) with rigorous code syntax (FORCHECK and NAGWare verified) and no external library dependencies• DL_POLY_4 (version 5.1)
– Domain Decomposition parallelisation, based on domain decomposition (no dynamic load balancing), limits: up to ≈2.1×109 atoms with inherent parallelisation
– Parallel I/O (amber netCDF) and radiation damage features
– Free format (flexible) reading with some fail-safe features and basic reporting (but not fully fool-proofed)
• DL_POLY_Classic (version 1.9)– Replicated Data parallelisation, limits up to ≈30,000
atoms with good parallelisation up to 100 (system dependent) processors (running on any processor count)
– Hyper-dynamics, Temperature Accelerated Dynamics, Solvation Dynamics, Path Integral MD
– Free format reading (somewhat rigid)
Current Versions
DL_POLY on the Web
WWW:WWW:
http://www.ccp5.ac.uk/DL_POLY/http://www.ccp5.ac.uk/DL_POLY/
FTP:ftp://ftp.dl.ac.uk/ccp5/DL_POLY/
DEV:http://ccpforge.cse.rl.ac.uk/gf/project/dl-poly/
http://ccpforge.cse.rl.ac.uk/gf/project/dl_poly_classic/
FORUM:http://www.cse.stfc.ac.uk/disco/forums/
ubbthreads.php/
W. Smith and T.R. Forester,J. Molec. Graphics (1996), 14, 136
W. Smith, C.W. Yong, P.M. Rodger,Molecular Simulation (2002), 28, 385
I.T. Todorov, W. Smith, K. Trachenko, M.T. Dove,J. Mater. Chem. (2006), 16, 1611-1618
W. Smith (Guest Editor),Molecular Simulation (2006), 32, 933
I.J. Bush, I.T. Todorov and W. Smith,Comp. Phys. Commun. (2006), 175, 323-329
Further Information
DL_POLY_DD Development Statistics
Lines [x 1,000]Comment 4.0Blank 5.6Total 36.5Manual
178 p
DL_POLY_3.01
DL_POLY_4.05
Lines [x 1,000]excluding CUDA portComment 14.5Blank 31.5Total 128.9Manual317 p
-----------s
ingle
developer-----------
reengineered
reen
gin
eere
d
reengineering
DL_POLY Licence Statistics
Valid e-mail list at end of2010 :: DL_POLY (2+3+MULTI) - 1,000 (list end)2012 :: DL_POLY_4 - 2,000 (list start 2011)
web-registration
Part Part 22
The Molecular Dynamics MethodThe Molecular Dynamics Method
Molecular Dynamics: Definitions
• Theoretical tool for modelling the detailed microscopic behaviour of many different types of systems, including; gases, liquids, solids, polymers, surfaces and clusters.• In an MD simulation, the classical equations of motion governing the microscopic time evolution of a many body system are solved numerically, subject to the boundary conditions appropriate for the geometry or symmetry of the system.• Can be used to monitor the microscopic mechanisms of energy and mass transfer in chemical processes, and dynamical properties such as absorption spectra, rate constants and transport properties can be calculated.• Can be employed as a means of sampling from a statistical mechanical ensemble and determining equilibrium properties. These properties include average thermodynamic quantities (pressure, volume, temperature, etc.), structure, and free energies along reaction paths.
MD simulations are used for:
• Microscopic insight: we can follow the motion of a single molecule (glass of water)
• Investigation of phase change (NaCl)• Understanding of complex systems
like polymers (plastics – hydrophilic and hydrophobic behaviour)
Molecular Dynamics for Beginners
Example: Simulation of Argon
rrcutcut
612
4)(rr
rV
Pair Potential:Pair Potential:
Lagrangian:Lagrangian:
L r v m v V ri i i ii
N
ijj ii
N
( , ) ( )
1
22
1
Lennard -Jones Potential
612
4)(rr
rV
V(r)V(r)
rr
rrcutcut
Pair-wise radial distancePair-wise radial distance
Equations of Motion
d
dt
L
v
L
ri i
)( ijiij
N
ijiji
iii
rVf
fF
Fam
Lagrange Equation Lagrange Equation – time evolution– time evolution
Force Evaluation – Force Evaluation – particle particle interactionsinteractions
Boundary Conditions
• None – biopolymer simulations
• Stochastic boundaries – biopolymers
• Hard wall boundaries – pores, capillaries
• Periodic boundaries – most MD simulations
2D cubic periodic2D cubic periodic
Periodic Boundary Conditions
TriclinicTriclinic
Truncated octahedronTruncated octahedron
Hexagonal prismHexagonal prism
Rhombic dodecahedronRhombic dodecahedron
• Kinetic Energy:
• Temperature:
• Configuration Energy:
• Pressure:
• Specific heat:
K E m vi ii
N
. . 12
2
TNk
K EB
2
3. .
U V rc ijj i
N
i
( )
N
iiiB frTNkPV
3
1
v
BBNVEc C
NkTNkU
2
31
2
3)( 222
System Properties: Static (1)
Structural Properties
– Pair correlation (Radial Distribution Function):
– Structure factor:
– Note: S(k) available from x-ray diffraction
System Properties: Static (2)
g rn r
r r
V
Nr rij
j i
N
i
( )( )
( )
4 2 2
drrrgkr
krkS 2
01)(
)sin(41)(
RR
RR
Radial Distribution Function (RDF)
g(r)g(r)
separation (r)separation (r)
1.01.0
Typical RDF
Single correlation functions:
Mean squared displacement (Einstein relation)
Velocity Autocorrelation (Green-Kubo relation)
2Dt= 13
⟨∣r i( t )−r i(0 )∣2 ⟩
D=130
∞⟨v i(t )⋅v i(0 )⟩dt
System Properties: Dynamic (1)
Collective Correlation Functions: DL_POLY GUI
• General van Hove correlation function
• van Hove self-correlation function
• van Hove distinct correlation function
N
jiji trrr
NtG
1,
)]()0([1
),( r
N
iiis trrr
NtG )]()0([
1),( r
N
i
N
ijjid trrr
NtG )]()0([
1),( r
System Properties: Dynamic (2)
Correlation Function Uses
• Complete description of bulk dynamical properties
• Space-time Fourier Transform of van Hove function
• Elastic properties of materials• Energy dissipation• Sound propagationObtained directly from neutron scattering
Part 2Part 2
DL_POLY Basics & AlgorithmsDL_POLY Basics & Algorithms
Supported Molecular Entities
Point ionsand atoms
Polarisable ions
(core+shell)Flexible
moleculesConstraint
bonds
Rigidmolecules
Flexiblylinked rigidmolecules
Rigid bondlinked rigidmolecules
Force Field Definitions – I
• particle: a rigid ion or atom (charged or not), a core or a shell of a polarisable ion (with or without associated degrees of freedom), a massless charged site. A particle is a countable object and has a global ID index.• site: a particle prototype that serves to define the chemical & physical nature (topology/connectivity/stoichiometry) of a particle (mass, charge, frozen-ness). Sites are not atoms they are prototypes!• Intra-molecular interactions: chemical bonds, bond angles, dihedral angles, improper dihedral angles, inversions. Usually, the members in a unit do not interact via an inter-molecular term. However, this can be overridden for some interactions. These are defined by site.• Inter-molecular interactions: van der Waals, metal (2B/E/EAM, Gupta, Finnis-Sinclair, Sutton-Chen), Tersoff, three-body, four-body. Defined by species.
Force Field Definitions – II
• Electrostatics: Standard Ewald*, Hautman-Klein (2D) Ewald*, SPM Ewald (3D FFTs), Force-Shifted Coulomb, Reaction Field, Fennell damped FSC+RF, Distance dependent dielectric constant, Fuchs correction for non charge neutral MD cells.
• Ion polarisation via Dynamic (Adiabatic) or Relaxed shell model.
• External fields: Electric, Magnetic, Gravitational, Oscillating & Continuous Shear, Containing Sphere, Repulsive Wall.
• Intra-molecular like interactions: tethers, core shells units, constraint and PMF units, rigid body units. These are also defined by site.
• Potentials: parameterised analytical forms defining the interactions. These are always spherically symmetric!
• THE CHEMICAL NATURE OF PARTICLES DOES NOT CHANGE IN SPACE AND TIME!!! *
Force Field by Sums
i
N
1iexternal
N
ijishell-coreshell-core
N
i0tttethertether
N
idcbainversinvers
N
idcbadiheddihed
N
icbaangleangle
N
ibabondbond
N'
ji,
N'
ji,jiij
N
ijipairmetal
N'
nk,j,i,nkjibody4
N'
kj,i,kjibody3
N'
kj,i,kjiTersoff
N'
ji, ji
ji
0
N'
ji,jipairN21
rΦ|rr|,iUr,r,iU
r,r,r,r,iUr,r,r,r,iU
r,r,r,iUr,r,iU
)|rr|(ρF|)rr(|Vε
r,r,r,rUr,r,rUr,r,rU
|rr|
4π
1|)rr(|U)r,.....,r,rV(
-shellcore
-shellcore
tether
tether
invers
invers
dihed
dihed
angle
angle
bond
bond
Boundary Conditions
• None (e.g. isolated macromolecules)
• Cubic periodic boundaries
• Orthorhombic periodic boundaries
• Parallelepiped (triclinic) periodic boundaries
• Truncated octahedral periodic boundaries*
• Rhombic dodecahedral periodic boundaries*
• Slabs (i.e. x,y periodic, z non-periodic)
DL_POLY is designed for homogeniousdistributed parallel machines
M1 P1
M2 P2
M3 P3
M0 P0 M4P4
M5P5
M6P6
M7P7
Assumed Parallel Architecture
InitializeInitialize
ForcesForces
MotionMotion
StatisticsStatistics
SummarySummary
InitializeInitialize
ForcesForces
MotionMotion
StatisticsStatistics
SummarySummary
InitializeInitialize
ForcesForces
MotionMotion
StatisticsStatistics
SummarySummary
InitializeInitialize
ForcesForces
MotionMotion
StatisticsStatistics
SummarySummary
AA BB CC DD
Replicated Data Strategy – I
• Every processor sees the full system
• No memory distribution (performance overheads and limitations increase with increasing system size)
• Functional/algorithmic decomposition of the workload
• Cutoff ≤ 0.5 min system width
• Extensive global communications (extensive overheads increase with increasing system size)
Replicated Data Strategy – II
1,2 1,3 1,4 1,5 1,6 1,7
2,3 2,4 2,5 2,6 2,7 2,8
3,4 3,5 3,6 3,7 3,8 3,9
4,5 4,6 4,7 4,8 4,9 4,10
5,6 5,7 5,8 5,9 5,10 5,11
6,7 6,8 6,9 6,10 6,11 6,12
7,8 7,9 7,10 7,11 7,12
8,9 8,10 8,11 8,12 8,1
9,10 9,11 9,12 9,1 9,2
10,11 10,12 10,1 10,2 10,3
11,12 11,1 11,2 11,3 11,4
12,1 12,2 12,3 12,4 12,5
PP00
PP00
PP00
Distributed list!
Parallel (RD) Verlet List
AA BB
CC DD
Domain Decomposition MD
• Linked lists provide an elegant way to scale short-ranged two body interactions from O(N2/2) to ≈O(N). The efficiency increases with increasing link cell partitioning – as a rule of thumb best efficacy is achieved for cubic-like partitioning with number of link-cells per domain ≥ 4 for any dimension.
• Linked lists can be used with the same efficiency for 3-body (bond-angles) and 4-body (dihedral & improper dihedral & inversion angles) interactions. For these the linked cell halo is double-layered and since cutoff3/4-body ≤ 0.5*cutoff2-body which makes the partitioning more effective than that for the 2-body interactions.
• The larger the particle density and/or the smaller the cutoff with respect to the domain width, (the larger the sub-selling and the better the spherical approximation of the search area), the shorter the Verlet neighbour-list search.
Linked Cell Lists
6
1 2 3 4 5
LinkCell 2
6
10
12
16
17
Head of ChainList
1 2 3 4 5 6 7 8 9 2019181716151413121110
10 12 16 17 0LinkList
Atom number
Cell number
Linked Cell List Idea
• Provides dynamically adjustable workload for variable local density and VNL speed up of ≈ 30% (45% theoretically).
• Provides excellent serial performance, extremely close to that of Brode-Ahlrichs method for construction of the Verlet neighbour-list when system sizes are smaller < 5000 particles.
1 2 3 4 5 6 7
Sub-celling of LCs
• Bonded forces:- Algorithmic decomposition for DL_POLY_C- Interactions managed by bookkeeping
arrays, i.e. explicit bond definition!!!- Shared bookkeeping arrays
• Non-bonded forces:– Distributed Verlet neighbour list (pair forces)– Link cells (3,4-body forces)
• Implementations differ between DL_POLY_4 & C!
Parallel Force Calculation
Molecular Molecular force fieldforce fielddefinitiondefinition
Glo
bal Fo
rce F
ield
Glo
bal Fo
rce F
ield
PP00LocalLocalforceforcetermsterms
PP11LocalLocalforceforcetermsterms
PP22LocalLocalforceforcetermsterms
Pro
cess
ors
Pro
cess
ors
DL_POLY_C and Bonded Forces
Glo
bal fo
rce fi
eld
Glo
bal fo
rce fi
eld
PP00LocalLocalatomicatomicindicesindices
PP11LocalLocalatomicatomicindicesindices
PP22LocalLocalatomicatomicindicesindices
Pro
cess
or
Dom
ain
sPro
cess
or
Dom
ain
s
Tricky!Tricky!Molecular Molecular force fieldforce fielddefinitiondefinition
DL_POLY_4 and Bonded Forces
EE
AA
BB
CC
DDFF
GG
HH 11
22
33
44
55
66
77
Replicated dataReplicated data Domain decompositionDomain decomposition
Topology Distribution
AA11 AA33 AA55 AA77 AA99 AA1111 AA1313 AA1515 AA1717
AA22 AA44 AA66 AA88 AA1010 AA1212 AA1414 AA1616
PP00 PP11 PP22 PP33 PP44 PP55 PP66 PP77 PP88
RD Distribution Scheme: Bond Forces
AA BB
CC DD
AA11
AA33
AA55
AA77
AA99
AA1111
AA1313
AA1515
AA1717
AA22
AA44
AA66
AA88
AA1010
AA1212
AA1414
AA1616
DD Distribution Scheme: Bond Forces
Ensembles and Algorithms
Integration:
Available as velocity Verlet (VV) or leapfrog Verlet (LFV) generating flavours of the following ensembles• NVE
• NVT (Ekin) Evans
• NVT Andersen^, Langevin^, Berendsen, Nosé-Hoover, GST• NPT Langevin^, Berendsen, Nosé-Hoover, Martyna-Tuckerman-Klein^
• NT/NPnAT/NPnT Langevin^, Berendsen, Nosé-Hoover, Martyna-Tuckerman-Klein^
Constraints & Rigid Body Solvers: • VV dependent – RATTLE, No_Squish, QSHAKE*• LFV dependent – SHAKE, Euler-Quaternion, QSHAKE*
Integration Algorithms
Essential Requirements:
• Computational speed• Low memory demand• Accuracy• Stability (energy conservation, no drift)• Useful property - time reversibility• Extremely useful property –
symplecticness = time reversibility + long term stability
r (t)
r (t+t)v (t)t
f(t)t2/m
Net
displacement
r’ (t+t)
[r (t), v(t), f(t)] [r (t+t), v(t+t), f(t+t)]
Integration: Essential Idea
Simulation Cycle and Integration Schemes
SetupSetup
ForcesForces
MotionMotion
Stats.Stats.
ResultsResults
Set up initial Set up initial systemsystem
Calculate Calculate forcesforces
Calculate Calculate motionmotion
Accumulate Accumulate statistical datastatistical data
Summarise Summarise simulationsimulation
Taylor expansion:
)()()(.3
)()()(.2
)(.1
)(),(.0
21
21
21
21
ttvttxttx
m
tftttvttv
afreshcalculatedtf
ttvtx
iii
i
iii
i
ii
Leapfrog Verlet (LFV)
Velocity Verlet (VV)
i
iii
i
iii
i
iii
iii
m
ttftttvttv
afreshcalculatedttf
ttvt
txttx
m
tfttvttv
tftvtx
)(
2)()(.VV2.1
)(.0VV2.
)(2
)()(.2VV1.
)(
2)()(.1VV1.
)(),(),(.0VV1.
21
21
21
32
1 2tO
m
ftvtrr nnnn
Integration Algorithms: Leapfrog Verlet
Discrete timeDiscrete time
rrn+1n+1rrnnrrn-1n-1rrn-2n-2
vvn+1/2n+1/2
f f nnf f n-1n-1f f n-2n-2
vvn-1/2n-1/2vvn-3/2n-3/2
)(
)(
42/11
32/12/1
tvtrr
tFm
tvv
ni
ni
ni
ni
i
ni
ni
Application in Practice
2
2/12/1
2/11
2/12/1
ni
nin
i
ni
ni
ni
ni
i
ni
ni
vvv
vtrr
Fm
tvv
Integration Algorithms: Velocity Verlet
Discrete timeDiscrete time
rrn+1n+1rrnnrrn-1n-1rrn-2n-2
vvn+1n+1vvnnvvn-1n-1vvn-2n-2
f f n+1n+1f f nnf f n-1n-1f f n-2n-2
12/11
2/11
2/1
2
2
ni
i
ni
ni
ni
ni
ni
ni
i
ni
ni
Fm
tvv
vtrr
Fm
tvv
Application in Practice
)()(2
)(2
211
42
1
tFFm
tvv
tFm
tvtrr
ni
ni
i
ni
ni
ni
i
ni
ni
ni
Constraint Solvers
SHAKE
RATTLERATTLE_R (SHAKE)
Taylor expansions: 3
2
1 2tO
m
gftvtrr nnnnn
31 22
1 tOm
hftvv nnnn
jiij
uij
oij
uijijij
ij
oijijjiij
mm
dd
dd
tg
dgGG
111
)(
2
22
2
uij
oij
uij
oijij
ijdd
dd
tg
)(
22
2
RATTLE_Vijd
i
joiv
oiv
2
)(
ij
oijjiij
ij
oijijjiij
d
dvv
th
dhHH
oijd
ijd
uijd
oioj
ui i
ujj
jiG
ijG
MUMU11
MUMU22
MUMU33
MUMU44
Replicated Data SHAKE
Proc AProc A Proc BProc B
Extended Ensembles in VV casting
Velocity Verlet integration algorithms can be naturally derived from the non-commutable Liouvile evolution operator by using a second order Suzuki-Trotter expansion. Thus they are symplectic/true ensembles (with conserved quantities) warranting conservation of the phase-space volume, time-reversibility and long term numerical stability…
Examplary VV Expansion of NVE to NVEkin, NVT, NPT & NσT
ttttRRATTLE
tttvt
txttx
tm
tfttvttv
tttttThermostat
ttttBarostat
ttttThermostat
tftvtx
iii
i
iii
iii
:)(_
:)(2
)()(
:)(
2)()(
:)(
:)(
:)(
)(),(),(
:VV1
21
21
21
41
21
41
21
21
41
41
tttttThermostat
tttttBarostat
tttttThermostat
tttttVRATTLE
tm
ttftttvttv
afreshttfttvttx
i
iii
iii
41
23
21
21
41
43
21
21
21
21
21
:)(
:)(
:)(
:)(_
:)(
2)()(
)(),(),(
:VV2
Other Integration Algorithms
• Gear Predictor-Corrector – generally easily extendable to any high order of accuracy. It is used in satellite trajectory calculations/corrections. However, lacking long term stability.
• Trotter derived evolution algorithms – generally easily extendable to any high order of accuracy. Symplectic.
Base Functionality
• Molecular dynamics of polyatomic systems with options to save the micro evolution trajectory at regular intervals
• Optimisation by conjugate gradients method or zero Kelvin annealing
• Statistics of common thermodynamic properties (temperature, pressure, energy, enthalpy, volume) with options to specify collection intervals and stack size for production of rolling and final averages
• Calculation of RDFs and Z-density profiles• Temperature scaling, velocity re-Gaussing• Force capping in equilibration
• Radiation damage driven features:• defects analysis• boundary thermostats• volumetric expansion• replay history• variable time step algorithm
• Extra ensembles:• Langevin, Andersen, MTK, GST• extensions of NsT to NPnAT and NPnT
• Infrequent k-space Ewald evaluation• Direct VdW• Direct Metal• Force shifted VdW• I/O driven features Parallel I/O & netCDF• Extra Reporting
DL_POLY_4 Specials
Part 3Part 3
DL_POLY I/O FilesDL_POLY I/O Files
I/O Files
• Crystallographic (Dynamic) data• Simulation Control data• Molecular / Topological Data• Tabulated two-body potentials• EAM potential data• Reference data for DEFECTS
• Restart data
• Final configuration• Simulation summary data• Trajectory Data
• Defects data
• MSD & T inst data
• Statistics data• Best CGM configuration
• RDF data
• Z density data
• Restart data
Internally, DL_POLY uses atomic scale units:
• Mass- mass of H atom (D) [Daltons]• Charge - charge on proton (e)• Length - Angstroms (Å)• Time - picoseconds (ps)• Force - D Å ps-2
• Energy - D Å2 ps-2 [10 J mol-1]
pressure is expressed in k-atm for I/O
DL_POLY Units
UNITS directive in FIELD file allows to opt for the following energy units
•Internal DL_POLY units - 10 J mol-1 •Electron-volts - eV•Kilo calories per mol - k-cal mol-1
•Kilo Joules per mol - k-J mol-1
•Kelvin per Boltzmann - K Boltzmann-1
All interaction MUST have the same energy units!
Acceptable DL_POLY Units
• SIMULATION CONTROL
• Free Format• Mandatory• Driven by keywords:
keyword [options] {data}
e.g.:
ensemble NPT Hoover 1.0 8.0
CONTROL File
CONFIG [REVCON,CFGMIN] File
• Initial atomic coordinates• Format
- Integers (I10)- Reals (F20)- Names (A8)
• Mandatory• Units:
- Position - Angstroms (Å)- Velocity – Å ps-1
- Force - D Å ps-2
• Construction:- Some kind of GUI
essential for complex systems
• Force Field specification
• Mandatory• Format:
- Integers (I5)- Reals (F12)- Names (A8)- Keywords (A4)
• Maps on to CONFIG file structure
• Construction- Small systems - by hand- Large systems – nfold or
GUI!
FIELD File
• Defines non-analytic pair (vdw) potentials
• Format- Integers (I10)- Reals (F15)- Names (A8)
• Conditional, activated by FIELD file option
• Potential & Force • NB force (here) is:
)()( rUr
rrG
TABLE File
• Defines embedded atom potentials• Format
- Integers (I10)- Reals (F15)- Names (A8)
• Conditional, activated by FIELD file option• Potentials only • pair, embed & dens keywords for atom
types followed by data records (4 real numbers per record)
• Individual interpolation arrays
TABEAM File
• Provides program restart capability
• File is unformatted (not human readable)
• Contains thermodynamic accumulators, RDF data, MSD data and other checkpoint data
• REVIVE (output file) ---> REVOLD (input file)
REVOLD [REVIVE] File
• Provides Job Summary (mandatory!)• Formatted to be human readable• Contents:
- Summary of input data- Instantaneous thermodynamic data at selected intervals- Rolling averages of thermodynamic data- Statistical averages- Final configuration- Radial distribution data- Estimated mean-square displacements and 3D diffusion
coefficient• Plus:
- Timing data, CFG and relaxed shell model iteration data- Warning & Error reports
OUTPUT File
• System properties at intervals selected by user
• Optional• Formatted (I10,E14)• Intended use: statistical
analysis (e.g. error) and plotting vs. time.
• Recommend use with GUI!
• Header:- Title- Units
• Data:- Time step, time, #
entries- System data
STATIS File
Configuration data at user selected intervals•Formatted•Optional
Header:•Title•Data level, cell key, number
Configuration data:•Time step and data keys•Cell Matrix•Atom name, mass, charge•X,Y,Z coordinates (level 0)•X,Y,Z velocities (level 1)•X,Y,Z forces (level 2)
HISTORY File
• Formatted (A8,I10,E14)• Plotable• Optional• RDFs from pair forces• Header:
- Title- No. plots & length of plot
• RDF data:- Atom symbols (2)- Radius (A) & RDF- Repeated……
• ZDNDAT file has same format
RDFDAT [ZDNDAT] File
• REFERENCE file– Reference structure to compare against
• DEFECTS file– Trajectory file of vacancies and interstitials
migration
• MSDTMP file– Trajectory like file containing the each
particle’s Sqrt(MSDmean) and Tmean
DL_POLY_4 Extra Files
Part 5Part 5
DL_POLY_4 PerformanceDL_POLY_4 Performance
Proof of Concept on IBM p575 2005
300,763,000 NaCl with full SPME electrostatics evaluation on 1024 CPU cores
HECToR (2013)Start-up time 60 min 15 min Timestep time 68 sec 23 secFFT evaluation 55 sec 18 sec
In theory ,the system can be seen by the eye. Although you would need a very good microscope – the MD cell size for this system is 2μm along the side and as the wavelength of the visible light is 0.5μm so it should be theoretically possible.
2000 4000 6000 8000 10000 12000 14000 16000
2000
4000
6000
8000
10000
12000
14000
16000
14.6 million particle Gd2Zr
2O
7 system
Sp
ee
d G
ain
Processor count
Perfect MD step total Link cells van der Waals Ewald real Ewald k-space
Benchmarking BG/L Jülich 2007
0 200 400 600 800 1000
0
200
400
600
800
1000
max load 700'000 atoms per 1GB/CPUmax load 220'000 ions per 1GB/CPUmax load 210'000 ions per 1GB/CPU
Solid Ar (32'000 atoms per CPU) NaCl (27'000 ions per CPU) SPC Water (20'736 ions per CPU)
21 million atoms
28 million atoms
33 million atoms
Sp
ee
d G
ain
Processor Count
Weak Scaling
DL_POLY_4 RB v/s CB
HECToR (Cray XE6) 2013
Weak Scaling and Cost of Complexity
HECToR (Cray XE6) 2013
I/O Solutions
1.Serial read and write (sorted/unsorted) – where only a single MPI task, the master, handles it all and all the rest communicate in turn to or get broadcasted to while the master completes writing a configuration of the time evolution.
2.Parallel write via direct access or MPI-I/O (sorted/unsorted) – where ALL / SOME MPI tasks print in the same file in some orderly manner so (no overlapping occurs using Fortran direct access printing. However, it should be noted that the behaviour of this method is not defined by the Fortran standard, and in particular we have experienced problems when disk cache is not coherent with the memory).
3.Parallel read via MPI-I/O or Fortran
4.4.Serial NetCDF read and writeSerial NetCDF read and write using NetCDF libraries using NetCDF libraries for machine-independent data formats of array-based, for machine-independent data formats of array-based, scientific data (widely used by various scientific scientific data (widely used by various scientific communities).communities).
MX1-1
PX0-1
M1
P1
The Advanced Parallel I/O Strategy
MX0+1
PX0+1
MX1-1
PX1-1
M0
P0
MXn+1
PXn+1
MN-1
PN-1
MXn
PXn
MX0
PX0
I/O Group 0
I/O Group 1
I/O Group n=M-1
I/O
B
ATC
HI/
O
BA
TC
HI/
O
BA
TC
HD
I S
KD
I S
K
Memory
PHEAD Pslave I/O WRITE COMMS
I/O READ COMMS
HECToR (Cray XE6) 2012
•72 I/O NODES
•READ ~ 50-300 Mbyte/s with best performance on 16 to 128 I/O Groups
•WRITE ~ 50-150 Mbyte/s with best performance on 64 to 512 I/O Groups
•Performance depends on user defined number of I/O groups, and I/O batch (memory CPU to disk) and buffer (memory of comms transactions between CPUs)
•Reasonable defaults as a function of all MPI tasks are provided
N compute cores of which M < N N compute cores of which M < N do I/Odo I/O
Part 4Part 4
Obtaining & Building DL_POLYObtaining & Building DL_POLY
DL_POLY Licensing and Support
• Online Licence Facility at http://www.ccp5.ac.uk/DL_POLY/• The licence is
• To protect copyright of Daresbury Laboratory• To reserve commercial rights• To provide documentary evidence justifying continued support by UK Research Councils
• It covers only the DL_POLY_4 package• Registered users are entered on the DL_POLY e-mailing list• Support is available (under CCP5 and EPSRC service level agreement to CSED) only to UK academic researchers• For the rest of the world there is the online user forum• Last but not least there is a detailed, interactive, self-referencing PDF (LaTeX) user manual
• Register at http://www.ccp5.ac.uk/DL_POLY/
• Registration provides the decryption - procedure and password (sent by e-mail)
• Source is supplied by anonymous FTP (in an invisible directory)
• Source is in a tarred, gzipped and encrypted form
• Successful unpacking produces a unix directory structure
• Test and benchmarking data are also available
Supply of DL_POLY_4
DL_POLY_Classic Support
• Full documentation of software supplied with source• Support is available through the online user forum or the CCP5 user community
WWW:WWW:
http://www.ccp5.ac.uk/DL_POLY_CLASSIC/http://www.ccp5.ac.uk/DL_POLY_CLASSIC/
FTP:ftp://ftp.dl.ac.uk/ccp5/DL_POLY/
FORUM:http://www.cse.stfc.ac.uk/disco/forums/
ubbthreads.php/
• Downloads are available from CCPForge at http://ccpforge.cse.rl.ac.uk/gf/project/dl_poly_classic/
• No registration required – BSD licence
• Download source from: CCPForge: Projects: DL_POLY Classic: Files: dl_poly_classic: dl_poly_classic1.9
• Sources is a in tarred and gzipped form
• Successful unpacking produces a unix directory structure
• Test data are also available
Supply of DL_POLY_Classic
DL_POLYDL_POLY
buildbuild
sourcesource
executeexecute
javajava
utilityutility
Home of makefilesHome of makefiles
DL_POLY source codeDL_POLY source code
Home of executable Home of executable &&Working DirectoryWorking Directory
Java GUI source codeJava GUI source code
Utility codesUtility codes
data Test dataTest data
DL_POLY Directory Structure
1. Note differences in capabilities (e.g. linked rigid bodies) !!!2. Less than 10,000 atoms (if in parallel)? – DL_POLY Classic3. More than 30,000 atoms? – DL_POLY_44. Ratio cell_width/rcut < 3 (in any direction)? – DL_POLY_Classic
5. Less than 500 particles per processor? – DL_POLY_Classic
DL_POLY_C v/s DL_POLY_4 Usage
DL_POLY_ClassicSimple molecules (no SHAKE)• 8 or less, 10,000 atoms• 16 or less, 20,000 atoms• 32 or less, 30,000 atomsSimple ionics• 16 or less, 10,000 atoms• 64 or less, 20,000 atoms• 128 or less, 30,000 atomsMolecules (with SHAKE)• 64 max!
DL_POLY_4• Golden Rule 1: No fewer than
3x3x3 link cells per processor (if in parallel)
• Golden Rule 2: No fewer than 500 particles per processor (if in parallel)!
Part 6Part 6
DL_POLY_Classic FunctionalityDL_POLY_Classic FunctionalityW. SmithW. Smith
Special Algorithms
• Hyperdynamics– Bias potential dynamics– Temperature accelerated dynamics– Nudged elastic band
• Solvation properties:– Energy decomposition– Spectroscopic solvent shifts– Free energy of solution
• Metadynamics
Standard Input Special Input Standard Output Special Output
CONFIG REVOLD OUTPUT HISTORY
FIELD TABLE STATIS RDFDAT
CONTROL TABEAM REVIVE ZDNDAT
REVCON
HYPOLD HYPRES
EVENTS
CFGBSNnn
CFGTRAnn
PROnn.XY
SOLVAT
FREENG
STEINHARDT METADYNAMICS
ZETA
CFGMIN
Operation Type:
Standard use
Hyperdyn./TAD
Solvation
Metadynamics
Optimisation
I/O Files
Solvation Features
• Molecular Solvation Energy• Energy decomposition• Energy distribution functions
• Free Energy of Solvation• Mixed Hamiltonian method• Thermodynamic Integration
• Solution Spectroscopy• Solvent induced shifts• Solvation relaxation
• SOLVAT – Breakdown of system energy based on
molecular types– Energies of ground and excited states
• FREENG – Energy data for thermodynamic
integration
Solvation Files
Bias Potential Dynamics
Temperature Accelerated Dynamics
Metadynamics
Hyperdynamics
• Construct bias potential to reduce well depth of state A.
• Bias potential is zero at saddle point.
• Ratios of rates from state A to states B, C, etc. preserved:
• Suitable bias potential:
Bias Potential Dynamics
OriginalPotential
Bias Potential
ModifiedPotential
State A
Bias Potential Dynamics 2
b
b
bb
bb
b
b
A
Nb
ATSTTST
*bA
NbANTST
A
NbA
NbNTST
ANTST
A
Nb
A
Nb
N
A
NNb
Nb
N
NNb
Nb
NN
A
NN
NNN
A
RVkk
RVRVRVk
RVRVRVk
RVk
RV
RVff
dRVRVH
dRVRVHff
dH
dHff
exp/or
0 if exp/ and
exp/exp So
Now
exp
exp
exp
exp
exp
exp
*
*
*
Temperature Accelerated Dynamics
First order reactions:
Hopping probability:
P dt = k exp(-kt) dt
Lifetime of state: =1/k
Arrhenius:
k = A exp(-E*/RT)
log(1/) = log A - E*/RT 1/RTh 1/RTl1/RToo
log(1
/)
p1
p2
E1
E2
increasing simulationtime
stop time tend
Temperature Accelerated Dynamics 2
• Simulate system at high T & watch for transitions• When transition found, stop simulation and:
– Determine activation energy using nudged elastic band
– Record transition time, save `new’ state configuration
• Restart simulation in original state with new velocities.
• Search for new transitions. Hence build `library’ of transition data.
• Stop searching after time tend given by:
tend=exp[E2+(Th-Tl)(E2-)/Th]
• Commence new search from `first’ low T state.
Nudged Elastic Band
A
B
E
R
A B
C1 CN-1
C2 C3 C4
C0 CN
• N+1 configs (C0…CN) linearly interpolated From A to B
• Connect by spring (stiffness K)
• Remove `off tangent’ forces• Minimise all configs subject to presence of spring forces
• Resulting path is reaction path through saddle point
Kinetic Monte Carlo
• Simulate set of competing processes• Rate of process is (make a list).• Define sum of rates
• Generate random number• Select process
• Advance time • Repeat!
Nipi ,,1; ip ir
N
iirR
1
10 : uu
i
jj
i
jji ruRrp
1
1
1
:
R
uRip
Rut /)log(
Invoking the Hyperdynamics Options
In the CONTROL file:
tadunits kJnum_block 500num_track 10blackout 1000catch_radius 3.5neb_spring 10.0deltad 6.91low_temp 40.0force 0.0025endtad
bpd pathunits eVvmin -3.9935E03ebias -3.5000E03num_block 300num_track 10catch_radius 3.5neb_spring 1.0force 0.00025endbpd
OR
Additional files for TAD and Bias potential dynamics:
•HYPRES/HYPOLD – restart files•EVENTS –program activity report•CFGBSNnn – Basin CONFIG files (new states)•PROnn.XY – Reaction path profiles•CFGTRAnn – Tracking CONFIG files
Subdirectories required in execute directory: BASINS, PROFILES, TRACKS
Hyperdynamics Files
TAD – DL_POLY Test Case 32
• Atoms `hop’ into vacancies• Each vacancy has 12 nearest
neighbour atoms• So 12 possible escapes from PE basin• Use TAD to find them!• Use NEB to find activation energy• Extrapolate to low temperature for
low T rate• Put results into KMC simulation
255 L-J Argon atoms FCC crystal + 1 vacancy
EVENTS file extract:
Event nt Basins Nt E Time(ps) Extrap.(ps) Stop time(ps)TRA 38500 0 1 1 7.28338E+00 3.82250E+01 4.31244E+07 2.04398E+03TRA 55500 0 2 1 7.20808E+00 5.49650E+01 5.36891E+07 2.04398E+03TRA 127500 0 3 1 7.28160E+00 1.26145E+02 1.41830E+08 2.04398E+03TRA 750500 0 4 1 7.19597E+00 7.47515E+02 7.13444E+08 2.04398E+03
BPD – DL_POLY Test Case 33
998 NaCl ions rocksalt crystal + 2 vacancies
Event nt Basins Nt E Time(ps) Extrap.(ps)TRA 4500 0 1 1 6.74301E-01 4.39500E+00 7.34793E+03TRA 399300 1 2 1 1.11127E+00 3.99185E+02 6.45155E+05TRA 466500 2 3 1 6.57466E-01 4.66495E+02 7.53837E+05
EVENTS file extract:
• Overall neutral system• Ions `hop’ into vacancies• Escapes from PE basin unknown
(a priori)• Use BPD to find them!• Use NEB to find activation energy• Extrapolate hopping time for zero
bias• Put results into KMC simulation
NEB Reaction Profiles
Lennard Jones Argon
Sodium Chloride
Metadynamics
Metadynamics is a method devised by Alessandro Laio and Michele Parrinello for accelerating the exploration of a free energy landscape as the function of collective variables.
Method:• The system potential energy is augmented by a time-
dependent bias potential consisting of Gaussian functions of the collective variables
• The longer a simulation remains in a particular free energy minimum, the larger the bias potential becomes – thus forcing the system to seek out a new thermodynamic state.
• The accumulated bias potential provides a description of the free energy surface
Metadynamics in 1D
A. Laio & M. Parrinello, PNAS 99 (2002) 12562
Collective Variables?
A collective variable is a single number that defines an atomic structure (i.e. it is a function of ).
Most often they are called Order Parameters. Particular examples used in metadynamics are:
•The system potential energy:
•Simulation cell vectors:
•The Steinhardt order parameters:
•Tetrahedral order parameters:
and are maximum for particular structures.
Defining the bias potential in terms of order parameters
allows destabilization of particular structural phases.
)( NrUQ
Nr
Q
),,( cbah
Metadynamics Formulae
Order parameter vector:
System Hamiltonian:
Bias Potential:
and are chosen to `fill’ surface at acceptable rate
Force:
Free Energy Surface:
)(,),()( 1N
MNNM rsrsrs
]),([)(21
2
trsVrUm
pH NMN
N
i i
i
gN
k
Mk
MnM htssWtrsV1
22
2/)()(exp]),([
)()(1
Nji
M
j j
Nii
rss
VrUf
]),([lim
)( trsVt
sF NMMg
W h
• METADYNAMICS – Data defining the metadynamics hypersurface
• STEINHARDT – Defines the Steinhardt order parameters
• ZETA – Defines the tetrahedral order parameters
Metadynamics Files
Steinhardt Order Parameters
if 0
if 1)(
)(cos
2
1
if 1
)(
and and typesatom connecting vectors allover runs where
,)(
with, and typesatomfor
1
12
4
2
2112
1
1
1
2/12
rr
rrrrr
rr
rr
rf
Nb
YrfQ
QNN
Q
c
b
bbmb
N
bcm
mm
C
b
) typeof are atoms all (assuming atom
tolinked atoms of pairs ofnumber theis and
species of atoms allover run ,, Where
)3/1)(cos()(1
1
2
c
N
i
N
ij
N
jkjikikcijc
c
N
Nkji
rfrfNN
T
Tetrahedral Order Parameters
Ice Nucleation and Growth 1
0.5ns
D Quigley and PM Rodger, Molec. Sim. 35 (2009) 613
Bias:Q4
OO, Q6OO, T & PE
Ice Nucleation and Growth 2
0.75ns
D Quigley and PM Rodger, Molec. Sim. 35 (2009) 613
Bias:Q4
OO, Q6OO, T & PE
Ice Nucleation and Growth 3
1.25ns
D Quigley and PM Rodger, Molec. Sim. 35 (2009) 613
Bias:Q4
OO, Q6OO, T & PE
Ice Nucleation and Growth 4
1.5ns
D Quigley and PM Rodger, Molec. Sim. 35 (2009) 613
Bias:Q4
OO, Q6OO, T & PE
Conclusions
• DL_POLY Classic is free
• It's very versatile with advanced features
• Go get it!
Part 7Part 7
The DL_POLY Java GUIThe DL_POLY Java GUIW. SmithW. Smith
• Java - Free!
• Facilitate use of code
• Selection of options (control of capability)
• Construct (model) input files
• Control of job submission
• Analysis of output
• Portable and easily extended by user
GUI Overview
• Edit source in java directory
• Edit using vi,emacs, whatever
• Compile in java directory:
javac *.java
jar -cfm GUI.jar manifesto *.class
• Executable is GUI.jar
• But.....
****Don't Panic!****
The GUI.jar file is provided in the download
Compiling/Editing the GUI
• Invoke the GUI from within the execute directory (or equivalent):
java -jar ../java/GUI.jar
• Colour scheme options:
java -jar ../java/GUI.jar –colourscheme
with colourscheme one of:monet, vangoch, picasso, cezanne, mondrian
(default picasso).
Invoking the GUI
MenusMenus
The Monitor Window
Using Menus
ShowShowEditorEditorOptionOption
GraphicsGraphicsButtonsButtons
The Molecular Viewer
GraphicsGraphicsWindowWindow
EditorEditorButtonButton
EditorEditorButtonsButtons
The Molecular Editor
EditorEditorWindowWindow
• File - Simple file manipulation, exit etc.• FileMaker - make input files:
– CONTROL, FIELD, CONFIG, TABLE
• Execute– Select/store input files, run job
• Analysis– Static,
dynamic,statistics,viewing,plotting• Information
– Licence, Force Field files, disclaimers etc.
Available Menus
ButtonsButtons
Text BoxesText Boxes
A Typical GUI Panel
• VMD is a free software package for visualising MD data.
• Website: http://www.ks.uiuc.edu/Research/vmd/
• Useful for viewing snapshots and movies.– A plug in is available for DL_POLY HISTORY files– Otherwise convert HISTORY to XYZ or PDB format
DL_POLY & VMD
xyzPDB
DL_FIELD
‘black box’FIELD CONFIG
DL_FIELD – http://www.ccp5.ac.uk/DL_FIELD/
• Orgainic Fields – AMBER, CHARM, OPLS-AA, PCFF, Drieding, CHARM19 (united atom)
• Inorganic Fields including a core-shell polarisation option• Solvation Features, Auto-CONNECT feature for mapping
complex random structures such as gels and random polymers
• input units freedom and molecular rigidification
Protonated 4382 atoms (excluding water)19400 bond interactions 7993 angles interactions13000 dihedral interactions 730 VDW intearctions
Developed by Developed by C.W. YongC.W. Yong
SOD
DL_POLY Hands-OnDL_POLY Hands-On
http://www.ccp5.ac.uk/DL_POLY/TUTORIAL/EXERCISES/http://www.ccp5.ac.uk/DL_POLY/TUTORIAL/EXERCISES/index.htmlindex.html
UV
k
kq ik rc
o kj j
j
N
1
2
42 2
20 1
2
exp( / )
exp
1
4
4
0
3 22
1
o
n j
jnn j
N
Rjn
oj
j
N
q q
R rerfc R r
q
/
k
Vm n 2
1 3
/ ( , , )with:with:
The Ewald Summation
Parallelised Ewald Summation
• Self interaction correction - as is• Real Space terms:
– Handle using parallel Verlet neighbour list– For excluded atom pairs replace erfc by -erf
• Reciprocal Space Terms:– Distribute over atoms for traditional RD
parallelisation; or– Use a grid method such as SPME (by taking
advantage of data distribution). This can be especially advantageous for DD parallelisation!
q ik rj jj N N P
N
exp/
1
q ik rj jj
N P
exp/
1
q ik rj jj N P
N P
exp/
( / )
1
2
q ik rj jj N P
N P
exp( / )
/ )
2 1
3(
q ik rj jj N P
N P
exp/ )
( / )
3( 1
4
q ik rj jj
Nexp
1
2
Partition overPartition overatoms:atoms:
Add to EwaldAdd to Ewaldsum on allsum on allprocessorsprocessors
Global SumGlobal Sum
exp( / )k
k
2 2
2
4
ppPP
pp33
pp22
pp11
pp00
Repeat for eachRepeat for eachk vectork vector
Strategy in Replicated Data
2
102
22
exp)4/exp(
2
1
N
jjj
korecip rkiq
k
k
VU
The crucial part of the SPME method is the conversion The crucial part of the SPME method is the conversion of the Reciprocal Space component of the Ewald sum of the Reciprocal Space component of the Ewald sum into a form suitable for Fast Fourier Transforms (FFT).into a form suitable for Fast Fourier Transforms (FFT).Thus:Thus:
),,(),,(2
1321
,,321
321
kkkQkkkGV
Ukkk
T
orecip
becomes:becomes:
wherewhere G and Q are 3D grid arrays (see later)G and Q are 3D grid arrays (see later)
Ref: Essmann Ref: Essmann et alet al., J. Chem. Phys. (1995) ., J. Chem. Phys. (1995) 103103 8577 8577
Smoothed Particle-Mesh Ewald
Central idea - share discrete charges on 3D grid:Central idea - share discrete charges on 3D grid:
Cardinal B-Splines MCardinal B-Splines Mnn(u) - in 1D:(u) - in 1D:
)1(1
)(1
)(
)0,max()!(!
!)1(
)!1(
1)(
/2exp)1()/)1(2exp()(
/2exp)()(/2exp
11
1
0
12
0
uMn
unuM
n
uuM
kuknk
n
nuM
KikMKknikb
KikuMkbLkiu
nnn
nn
k
kn
n
n
jnj
RecursionRecursionrelationrelation
SPME: Spline Scheme
321 ,,333322221111
1
321
)()()(
),,(
nnnjnjnjn
N
jj KnuMKnuMKnuMq
Q
GGT T (k(k11,k,k22,k,k33) is the discrete Fourier Transform of the) is the discrete Fourier Transform of thefunction:function:
*3213212
22
321 )),,()(,,()4/exp(
),,( kkkQkkkBk
kkkkG T
2
33
2
22
2
11321 )()()(),,( kbkbkbkkkB withwith
Is the charge array and QIs the charge array and QTT(k(k11,k,k22,k,k33) its discrete Fourier transform.) its discrete Fourier transform.
SPME: Building the Arrays
• SPME is generally faster then conventional Ewald SPME is generally faster then conventional Ewald sum in most applications. Algorithm scales as sum in most applications. Algorithm scales as O(NO(NloglogN)N)
• In DL_POLY_Classic the FFT array is built in pieces In DL_POLY_Classic the FFT array is built in pieces on each processor and made whole by a global sum on each processor and made whole by a global sum for the FFT operationfor the FFT operation
• In DL_POLY_4 the FFT array is built in pieces on In DL_POLY_4 the FFT array is built in pieces on each processor and kept that way for the distributed each processor and kept that way for the distributed FFT operation (DaFT)FFT operation (DaFT)
• The DaFT FFT `hides’ all the implicit The DaFT FFT `hides’ all the implicit communicationscommunications
SPME: Comments
FFTs areFFTs are
• Fast (!) - O(VlogV) operations where V is the Fast (!) - O(VlogV) operations where V is the number of points in the gridnumber of points in the grid
• Global operations - to perform a FFT you need all Global operations - to perform a FFT you need all the pointsthe points
• This makes it difficult to write an efficient, good This makes it difficult to write an efficient, good scaling FFTscaling FFT
Parallel FFTs - The Basics
• Distribute the data by planesDistribute the data by planes
• Each processor has a complete set of points in the Each processor has a complete set of points in the xx and and yy directions so can do those Fourier transforms directions so can do those Fourier transforms
• Redistribute data so that a processor holds all the Redistribute data so that a processor holds all the points in points in zz
• Do the z transformsDo the z transforms
• Allows efficient implementation of the serial FFTs Allows efficient implementation of the serial FFTs ( use a library routine )( use a library routine )
• In practice for large enough 3D FFTs can scale In practice for large enough 3D FFTs can scale reasonablyreasonably
• However, the distribution does not map onto However, the distribution does not map onto DL_POLY’s - large amounts of data redistributionDL_POLY’s - large amounts of data redistribution
Traditional Parallel FFTs
• Takes data distributed as in DL_POLY_4 (Domain Takes data distributed as in DL_POLY_4 (Domain Decomposition) thus avoiding a lot of Decomposition) thus avoiding a lot of communication in comparison to traditional 3D FFT communication in comparison to traditional 3D FFT routines (due to data redistribution) when used on routines (due to data redistribution) when used on more than 8-64 CPUsmore than 8-64 CPUs
• So do a distributed data FFT in the So do a distributed data FFT in the xx direction then direction then the the y y and finally the and finally the zz directions directions
• Disadvantage is that can not use the library routine Disadvantage is that can not use the library routine for the 1D FFT ( not quite true … )for the 1D FFT ( not quite true … )
• Scales quite well - e.g. on 512 procs, an 8x8x8 CPU Scales quite well - e.g. on 512 procs, an 8x8x8 CPU grid, a 1D FFT need only scale to 8 CPUgrid, a 1D FFT need only scale to 8 CPU
• Totally avoids data redistribution, exists as a stand Totally avoids data redistribution, exists as a stand alone F90 library.alone F90 library.
Daresbury advanced Fourier Transforms
U. Essmann, L. Perera, M.L. Berkowtz, T. Darden, H. Lee, L.G. Pedersen, J. Chem. Phys., (1995), 103, 8577
1. Calculate self interaction correction2. Initialise FFT routine (FFT – 3D FFT)3. Calculate B-spline coefficients4. Convert atomic coordinates to scaled fractional units5. Construct B-splines6. Construct charge array Q7. Calculate FFT of Q array8. Construct array G9. Calculate FFT of G array10. Calculate net Coulombic energy11. Calculate atomic forces
RD Scheme for long-ranged part of SPME
U. Essmann, L. Perera, M.L. Berkowtz, T. Darden, H. Lee, L.G. Pedersen, J. Chem. Phys., 103, 8577 (1995)
1. Calculate self interaction correction2. Initialise FFT routine (FFT – IJB’s DaFT: 3M2 1D FFT)3. Calculate B-spline coefficients4. Convert atomic coordinates to scaled fractional units5. Construct B-splines6. Construct partial charge array Q7. Calculate FFT of Q array8. Construct partial array G9. Calculate FFT of G array10. Calculate net Coulombic energy11. Calculate atomic forces
I.J. Bush, I.T. Todorov, W. Smith, Comp. Phys. Commun., 175, 323 (2006)
DD Scheme for long-ranged part of SPME
Simplest option:
ewald precision [] (tolerance)
Advanced option (for anisotropic systems, particularly with slabs):
ewald [] [kmax1] [kmax2] [kmax3]
with
= exp[-( rcut)2]/rcut (solve for )
= exp[-(km/2)2]km2 (solve for km)
kmax n=Lnkm/2 , where Ln is the MD box width in n direction
Choosing Ewald Sum Parameters
Recommended