2. MIT 10.637 Lecture 5 Why molecular dynamics? Protein
folding: how proteins fold and misfold (Prof. Vijay Pande) Voelz,
Bowman, Beauchamp, Pande. JACS (2010).
3. MIT 10.637 Lecture 5 Molecular dynamics: F = ma. Classical
particles can be simulated by solving Newton's second law:
$$ F = m a = m \frac{d^2 r}{dt^2} $$
The force is the negative derivative of the potential energy at
position r:
$$ F = -\frac{\partial V(r)}{\partial r} $$
Here r is a vector containing the Cartesian coordinates of all
particles (i.e. of length $3N_{atom}$). The potential energy
function V (for now) comes from our force field parameters.
4. MIT 10.637 Lecture 5 Structure of an MD code 1. Initialize
positions and velocities, temperature, density, etc. 2. Compute
forces 3. Integrate equations of motion 4. Move atoms 5. Repeat 2-4
until equilibrated (desired properties are stable, potential energy
and kinetic energy are stable). 6. Continue 2-4 as production run
-> collecting data to average over.
5. MIT 10.637 Lecture 5 Initialization. Avoid random
initialization of positions: we don't want the energy to diverge.
Initial positions are generated from a structure, avoiding overly
short distances between molecules or inside molecules. Velocities
start out small or zero; the system can then be heated slowly,
giving more and more temperature (velocity) to the particles.
Randomizing initial velocities: the equipartition theorem relates
temperature to velocity. Choose random numbers from a uniform
distribution, make sure the net velocity results in a total
momentum of zero, then scale the velocities until the kinetic
energy matches the initial temperature.
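The velocity recipe on this slide can be sketched in a few lines. This is a minimal illustration, not the lecture's code: the function name `init_velocities`, the SI units, and the use of 3N degrees of freedom in the equipartition relation are all assumptions made for the example.

```python
import random

KB = 1.380649e-23  # Boltzmann constant in J/K


def init_velocities(masses, target_temp, seed=0):
    """Return a list of [vx, vy, vz] (m/s) with zero net momentum and a
    kinetic energy matching target_temp (K). Hypothetical helper."""
    rng = random.Random(seed)
    n = len(masses)
    # 1. Draw every component from a uniform distribution on [-0.5, 0.5).
    vel = [[rng.random() - 0.5 for _ in range(3)] for _ in range(n)]
    # 2. Subtract the centre-of-mass velocity so the total momentum is zero.
    total_mass = sum(masses)
    for k in range(3):
        v_com = sum(m * v[k] for m, v in zip(masses, vel)) / total_mass
        for v in vel:
            v[k] -= v_com
    # 3. Rescale so that KE = (3N/2) kB T (assumes 3N degrees of freedom).
    kinetic = 0.5 * sum(m * sum(c * c for c in v) for m, v in zip(masses, vel))
    scale = (1.5 * n * KB * target_temp / kinetic) ** 0.5
    return [[c * scale for c in v] for v in vel]
```

Because a uniform scaling does not change the direction of the total momentum vector, the zero-momentum condition from step 2 survives the rescaling in step 3.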
6. MIT 10.637 Lecture 5 Statistical mechanics: ensembles. Ways
in which a system of fixed volume can be described with statistical
mechanics. Microcanonical ensemble (NVE): fixed number of particles
(N) and fixed energy (E); every possible state with that energy and
composition has equal probability. Canonical ensemble (NVT): fixed
composition (N), in thermal equilibrium with a heat bath of a given
temperature (T); the energy can vary but the number of particles
cannot, and the probability of a state depends on its energy (the
origin of the Boltzmann distribution). Grand canonical ensemble
($\mu$VT): variable composition, in thermal and chemical
equilibrium with a reservoir at fixed temperature and with a
chemical potential for each particle; states can vary in energy and
number of particles. Macroscopic properties of these ensembles can
be calculated as weighted averages based on the partition
function.
7. MIT 10.637 Lecture 5 Ergodic hypothesis. We assume the
average obtained by following a small number of particles over a
long time is the same as averaging over a large number of particles
for a short time: time-averaging is equivalent to
ensemble-averaging. Alternatively: no matter where a system is
started, it can reach any other point in phase space.
8. MIT 10.637 Lecture 5 Choosing an ensemble. Ensemble menu:
choose one from each row.
  Particle number N   or   Chemical potential $\mu$
  Volume V            or   Pressure P
  Energy E            or   Temperature T
Most common combinations: Microcanonical ensemble (NVE): conserves
the total energy; S has a maximum in the equilibrium state.
Canonical ensemble (NVT): also called constant-temperature
molecular dynamics; requires a thermostat for exchanging energy; A
has a minimum at equilibrium. Isothermal-isobaric ensemble (NPT):
requires both a thermostat and a barostat and corresponds most
closely to laboratory conditions; G has a minimum at equilibrium.
9. MIT 10.637 Lecture 5 Molecular dynamics. [Figure: energy
versus conformation.] In molecular dynamics we can sample the parts
of the potential energy surface that are accessible with the energy
supplied to the system.
10. MIT 10.637 Lecture 5 Integration. Integration algorithms
need to: be fast and require little memory; allow us to choose a
long timestep; stay close to the exactly integrated trajectory;
conserve momentum and energy; be time-reversible; and be
straightforward to implement.
11. MIT 10.637 Lecture 5 Typical timescales. [Figure: timescale
axis running from $10^{-15}$ s (femto) through $10^{-12}$ (pico),
$10^{-9}$ (nano), $10^{-6}$ (micro) and $10^{-3}$ (milli) to
$10^{0}$ seconds, with labels: bond vibration, bond isomerization,
water dynamics, helix forms, fast conformational change, slow
conformational change, MD step, long MD run, "where we need to be",
"where we'd love to be".] Chemistry and protein dynamics occur on
a relatively slow timescale. The MD timestep is limited by the
highest-frequency vibration in the system, typically to 1/10 of the
period of that vibration. X-H bonds are typically the
highest-frequency vibration (3000 cm$^{-1}$, with a period of about
10 fs), so a typical timestep in classical MD is 1 fs.
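The quoted period can be checked with a short calculation: a wavenumber converts to a frequency through the speed of light, and the period is the inverse of that frequency. The function name below is made up for the example.

```python
C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def vibration_period_fs(wavenumber_cm):
    """Period in femtoseconds of a vibration given in cm^-1."""
    frequency_hz = C_CM_PER_S * wavenumber_cm
    return 1.0e15 / frequency_hz


period = vibration_period_fs(3000.0)  # roughly 11 fs for an X-H stretch
timestep = period / 10.0              # the 1/10 rule gives roughly 1 fs
```

The result, a little over 11 fs, rounds to the "about 10 fs" quoted on the slide and motivates the usual 1 fs timestep.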
12. MIT 10.637 Lecture 5 Choosing a time step. Too short:
computation needlessly slow. Too long: errors result from
approximations. Just right: errors acceptable, maximum speed.
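13. MIT 10.637 Lecture 5 Euler method. Taylor expansion for
particle position and velocity at time $t+\Delta t$, truncated
after the first-order term:
$$ r(t+\Delta t) = r(t) + v(t)\,\Delta t $$
$$ v(t+\Delta t) = v(t) + a(t)\,\Delta t $$
Recall that a comes from the forces: $a(t) = F(t)/m$.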
14. MIT 10.637 Lecture 5 Euler method Problems persist with
this method: First order method, local error scales with square of
the timestep. Global errors are larger. Not time-reversible.
Sensitive, easy to make unstable.
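The instability is easy to see on a one-dimensional harmonic oscillator. The sketch below assumes unit mass and unit force constant (so a = -x); the function names are illustrative. For this particular system each Euler step multiplies the energy by exactly $1 + \Delta t^2$, so the energy grows without bound.

```python
def euler_trajectory(x, v, accel, dt, n_steps):
    """Integrate with the explicit Euler method and return final (x, v)."""
    for _ in range(n_steps):
        a = accel(x)
        # Both updates use the values at time t (simultaneous assignment).
        x, v = x + v * dt, v + a * dt
    return x, v


def oscillator_energy(x, v):
    """Total energy of a unit-mass, unit-force-constant oscillator."""
    return 0.5 * v * v + 0.5 * x * x


x, v = euler_trajectory(1.0, 0.0, lambda q: -q, 0.1, 1000)
growth = oscillator_energy(x, v) / oscillator_energy(1.0, 0.0)
```

With $\Delta t = 0.1$ and 1000 steps, `growth` equals $(1.01)^{1000}$, an energy increase of more than four orders of magnitude.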
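15. MIT 10.637 Lecture 5 Leap-frog method. This method reduces
some of the error present in the Euler method by calculating
velocities at half-timestep offsets, which makes it a second-order
method. Step 1: Solve for acceleration/forces, $a(t) = F(r(t))/m$.
Step 2: Update velocities,
$v(t+\Delta t/2) = v(t-\Delta t/2) + a(t)\,\Delta t$. Step 3:
Update positions,
$r(t+\Delta t) = r(t) + v(t+\Delta t/2)\,\Delta t$. Repeat.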
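The three leap-frog steps map directly onto a loop. This is a minimal one-dimensional sketch with hypothetical names; it assumes the caller supplies the velocity at $t - \Delta t/2$, which for a start at rest can be estimated as $v(0) - a(0)\,\Delta t/2$.

```python
def leapfrog(x, v_half, accel, dt, n_steps):
    """Leap-frog integration in 1D.

    x is the position at time t, v_half the velocity at t - dt/2.
    Returns the position at t + n_steps*dt and the velocity half a
    step before that time.
    """
    for _ in range(n_steps):
        a = accel(x)               # Step 1: acceleration from the forces
        v_half = v_half + a * dt   # Step 2: v(t + dt/2)
        x = x + v_half * dt        # Step 3: r(t + dt)
    return x, v_half
```

For a unit oscillator started at x = 1 with zero velocity, `leapfrog(1.0, 0.005, lambda q: -q, 0.01, 628)` should track cos(t) closely over one period.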
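16. MIT 10.637 Lecture 5 The Verlet algorithm. Taylor expansion
for particle position at time $t+\Delta t$:
$$ r(t+\Delta t) = r(t) + v(t)\Delta t + \tfrac{1}{2} a(t)\Delta t^2 + \tfrac{1}{6} b(t)\Delta t^3 + O(\Delta t^4) $$
Taylor expansion for particle position at time $t-\Delta t$:
$$ r(t-\Delta t) = r(t) - v(t)\Delta t + \tfrac{1}{2} a(t)\Delta t^2 - \tfrac{1}{6} b(t)\Delta t^3 + O(\Delta t^4) $$
Add the expressions (the odd-order terms cancel):
$$ r(t+\Delta t) + r(t-\Delta t) = 2 r(t) + a(t)\Delta t^2 + O(\Delta t^4) $$
Here v is the velocity, a the acceleration, and b the third time
derivative of r (the time derivative of a).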
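17. MIT 10.637 Lecture 5 The Verlet algorithm. Positions
evaluated:
$$ r(t+\Delta t) = 2 r(t) - r(t-\Delta t) + a(t)\Delta t^2 $$
Approximation for the first timestep, where no earlier position is
available, e.g.:
$$ r(\Delta t) \approx r(0) + v(0)\Delta t + \tfrac{1}{2} a(0)\Delta t^2 $$
Acceleration is from the potential:
$$ a(t) = -\frac{1}{m}\frac{\partial V(r(t))}{\partial r} $$
Advantages: simple to program, conserves energy (and is
time-reversible). Disadvantages of the Verlet algorithm:
differences between large numbers can lead to finite-precision
issues, and velocities have to be calculated from the difference
between positions at $t+\Delta t$ and $t-\Delta t$ (velocity
extension),
$$ v(t) = \frac{r(t+\Delta t) - r(t-\Delta t)}{2\Delta t} $$
so we don't know instantaneous velocities or temperatures until the
new positions are available.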
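A minimal one-dimensional sketch of position Verlet follows. The names are illustrative, and the first step uses a second-order Taylor expansion, which is one common choice and an assumption here. It also shows why velocities lag: the central-difference estimate at step i needs the position at step i+1.

```python
def verlet(x0, v0, accel, dt, n_steps):
    """Position Verlet in 1D; returns positions at t = 0, dt, ..., n_steps*dt."""
    # First step: no x(-dt) exists yet, so use a Taylor expansion instead.
    x_prev = x0
    x = x0 + v0 * dt + 0.5 * accel(x0) * dt * dt
    traj = [x_prev, x]
    for _ in range(n_steps - 1):
        # r(t+dt) = 2 r(t) - r(t-dt) + a(t) dt^2
        x_next = 2.0 * x - x_prev + accel(x) * dt * dt
        x_prev, x = x, x_next
        traj.append(x)
    return traj


def central_velocity(traj, i, dt):
    """Velocity at step i from positions at steps i-1 and i+1."""
    return (traj[i + 1] - traj[i - 1]) / (2.0 * dt)
```

The velocity never enters the position update after the first step, which is the "no explicit dependence on velocities" property that Velocity Verlet addresses.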
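18. MIT 10.637 Lecture 5 The Velocity Verlet algorithm. Regular
Verlet has no explicit dependence on velocities, only on
accelerations; it would be better to depend on the velocity. This
is solved by the Velocity Verlet algorithm. Taylor expand position
and velocity:
$$ r(t+\Delta t) = r(t) + v(t)\Delta t + \tfrac{1}{2} a(t)\Delta t^2 $$
$$ v(t+\Delta t) = v(t) + a(t)\Delta t + \tfrac{1}{2} b(t)\Delta t^2 $$
Taylor expand the acceleration,
$a(t+\Delta t) = a(t) + b(t)\Delta t$, then rearrange and multiply
by $\Delta t/2$:
$$ \tfrac{1}{2} b(t)\Delta t^2 = \tfrac{\Delta t}{2}\left[a(t+\Delta t) - a(t)\right] $$
Here b is the second time derivative of the velocity.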
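19. MIT 10.637 Lecture 5 The Velocity Verlet algorithm.
Substitute in the expression for the second derivative of the
velocity. We get
$$ v(t+\Delta t) = v(t) + a(t)\Delta t + \tfrac{\Delta t}{2}\left[a(t+\Delta t) - a(t)\right] $$
which simplifies to
$$ v(t+\Delta t) = v(t) + \tfrac{\Delta t}{2}\left[a(t) + a(t+\Delta t)\right] $$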
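20. MIT 10.637 Lecture 5 The Velocity Verlet procedure. Step 1:
Evaluate new positions,
$r(t+\Delta t) = r(t) + v(t)\Delta t + \tfrac{1}{2} a(t)\Delta t^2$.
Step 2: Evaluate forces (acceleration) at $t+\Delta t$. Step 3:
Evaluate new velocities,
$v(t+\Delta t) = v(t) + \tfrac{1}{2}\left[a(t) + a(t+\Delta t)\right]\Delta t$.
Repeat the procedure.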
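The three-step procedure can be sketched as follows for a single coordinate. Function and variable names are illustrative; `accel` stands for whatever routine turns positions into accelerations (forces divided by mass).

```python
def velocity_verlet(x, v, accel, dt, n_steps):
    """Velocity Verlet in 1D; returns position and velocity at the same time."""
    a = accel(x)
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # Step 1: new positions
        a_new = accel(x)                     # Step 2: forces at t + dt
        v = v + 0.5 * (a + a_new) * dt       # Step 3: new velocities
        a = a_new                            # reuse the force next step
    return x, v
```

Only one force evaluation is needed per step because the acceleration from Step 2 is carried over. On a unit harmonic oscillator with $\Delta t = 0.1$, the total energy stays within a fraction of a percent of its starting value over 1000 steps, in contrast to the Euler method.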
21. MIT 10.637 Lecture 5 Predictor-corrector approach. 1.
Predict r, v, and a at time $t+\Delta t$ using second-order Taylor
expansions. 2. Calculate forces (and accelerations) from the new
positions $r(t+\Delta t)$. 3. Calculate the difference between the
actual (C) and predicted (P) accelerations:
$\Delta a(t+\Delta t) = a^{C}(t+\Delta t) - a^{P}(t+\Delta t)$.
4. Correct positions, velocities, and accelerations using the
acceleration difference $\Delta a(t+\Delta t)$. 5. Repeat.
22. MIT 10.637 Lecture 5 Updated formulas. Coefficients are
chosen to maximize the stability of the algorithm, e.g. the Gear
predictor-corrector has $c_0 = 1/6$, $c_1 = 5/6$, $c_2 = 1$,
$c_3 = 1/3$.
23. MIT 10.637 Lecture 5 Pros and cons of predictor-corrector.
Pros: positions and velocities are corrected to $\Delta t^4$; very
accurate for small $\Delta t$. Cons: not time-reversible; not
symplectic (area/energy preserving); takes more time, with two
force evaluations per step; high memory requirements (15N instead
of 9N).
24. MIT 10.637 Lecture 5 Use of constraints to increase the
integration step. The SHAKE algorithm fixes X-H bonds and allows an
increase of the timestep from 1 fs to 2 fs. Also, hydrogen mass
repartitioning: take mass from neighboring atoms and increase the
mass of hydrogen to ~4 amu, giving timesteps of ~4 fs. [Figure
labels: unconstrained update; project out forces along the bond;
correct for rotational lengthening.]
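25. MIT 10.637 Lecture 5 Lyapunov instability. Trajectories are
sensitive to initial conditions! The position of a particle at time
t depends on the initial positions and momenta of all N particles
plus the elapsed time:
$$ r(t) = f\left(r^N(0), p^N(0); t\right) $$
Perturbing the initial conditions of the momentum by a small
$\epsilon$:
$$ r'(t) = f\left(r^N(0), p^N(0) + \epsilon; t\right) $$
The difference $\Delta r(t) = r(t) - r'(t)$ diverges
exponentially,
$$ |\Delta r(t)| \sim \epsilon\, e^{\lambda t} $$
where $\lambda$ is the Lyapunov exponent.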
26. MIT 10.637 Lecture 5 Lyapunov instability. Example: two
particles out of 1000 in a Lennard-Jones simulation have the
x-component of their velocities changed by $+10^{-10}$ and
$-10^{-10}$. Monitor the sum of the squares of the differences in
positions of all particles, $\sum_i |r_i(t) - r_i'(t)|^2$: it gets
very large very quickly (after only about 1000 steps)!
27. MIT 10.637 Lecture 5 Periodic boundary conditions. We can
simulate the condensed phase with a limited number of particles if
we use periodic boundary conditions, which are needed to eliminate
surface effects. Each particle interacts with the closest images of
the other molecules, with the cutoff satisfying $r_{cut} < L/2$ for
box length L. A number of options in AMBER: cubic box, truncated
octahedron, spherical cap.
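The "closest image" rule is usually implemented with the minimum image convention. The sketch below assumes a cubic box of side `box_length`; the function names are made up for the example, and other box shapes such as the truncated octahedron need a different wrapping rule.

```python
def minimum_image(dx, box_length):
    """Wrap one separation component into the range [-L/2, L/2]."""
    return dx - box_length * round(dx / box_length)


def pbc_distance(r_i, r_j, box_length):
    """Distance between two particles using their closest periodic images."""
    d = [minimum_image(a - b, box_length) for a, b in zip(r_i, r_j)]
    return sum(c * c for c in d) ** 0.5
```

For example, two particles at x = 0.5 and x = 9.5 in a box of length 10 are 1.0 apart through the boundary, not 9.0. Keeping the cutoff below L/2 ensures that at most one image of each neighbor falls inside it.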
28. MIT 10.637 Lecture 5 Periodic boundary conditions. van der
Waals interactions are usually treated with a finite distance
cutoff. Ewald summation treats long-range electrostatics accurately
and efficiently by combining a real-space sum (short range) with a
reciprocal-space sum (long range, but short range in inverse
space), so it converges quickly. Particle Mesh Ewald uses FFTs and
scales as O(N log N). Choose a large enough simulation cell to
avoid contact between periodic images, e.g. protein-protein
interactions. Interaction cutoffs need to be no more than half the
shortest box dimension. The simulation cell needs to be neutralized
with counter-ions. [Figure: cutoff approaches that are better than
abrupt truncation.]
29. MIT 10.637 Lecture 5 Speeding up MD calculations. Lookup
tables: pre-compute interaction energies at various distances and
interpolate to get the value. Neighbor lists: keep lists of atoms
to calculate interactions for, and only update the list when atoms
have moved a certain distance (about every 10-20 timesteps for
liquids, infrequently for solids); storage can be an issue for very
large systems. Cell-index method: discretize the simulation cell
into sub-cells and search only the sub-cells within a certain
distance (e.g. nearest neighbors). Multiple timestep dynamics (e.g.
Berne's RESPA method): evaluate and update forces due to different
interactions on different timescales; long-range interactions like
electrostatics are updated most slowly, bonded terms are updated
most quickly. Rigid bonds/mass repartitioning (covered earlier).
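As an illustration of the neighbor-list idea, the sketch below builds a pair list out to the cutoff plus a "skin" and reports when the list must be rebuilt. Everything here is an assumption made for the example: the names, the cubic box, the brute-force O(N^2) build, and the common rule of rebuilding once any atom has moved more than half the skin. Displacements are measured on unwrapped coordinates.

```python
def build_neighbor_list(positions, box_length, r_cut, skin):
    """Return all pairs (i, j), i < j, closer than r_cut + skin."""
    r_list2 = (r_cut + skin) ** 2
    pairs = []
    n = len(positions)
    for i in range(n - 1):
        for j in range(i + 1, n):
            d2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                dx = a - b
                dx -= box_length * round(dx / box_length)  # minimum image
                d2 += dx * dx
            if d2 < r_list2:
                pairs.append((i, j))
    return pairs


def needs_rebuild(positions, positions_at_build, skin):
    """True once any atom has moved more than half the skin distance."""
    limit2 = (0.5 * skin) ** 2
    for p, q in zip(positions, positions_at_build):
        if sum((a - b) ** 2 for a, b in zip(p, q)) > limit2:
            return True
    return False
```

Between rebuilds, forces are computed only over the stored pairs, so the expensive all-pairs search runs every few steps rather than every step.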
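30. MIT 10.637 Lecture 5 Temperature in MD. The equipartition
theorem relates temperature to the average kinetic energy of the
system. The instantaneous temperature is:
$$ T(t) = \frac{1}{k_B N_f} \sum_{i=1}^{N} m_i |v_i(t)|^2 $$
where $N_f$ is the number of degrees of freedom (3N for N
unconstrained atoms). Thermostats may be used to control
temperature (e.g. in NPT and NVT ensembles).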
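The instantaneous temperature follows from equipartition: twice the kinetic energy divided by $k_B$ times the number of degrees of freedom. The sketch below uses SI units; the function name and the `n_constraints` argument (for removed degrees of freedom such as rigid bonds) are assumptions for the example.

```python
KB = 1.380649e-23  # Boltzmann constant in J/K


def instantaneous_temperature(masses, velocities, n_constraints=0):
    """Temperature (K) from masses (kg) and velocities (m/s)."""
    n_dof = 3 * len(masses) - n_constraints
    twice_ke = sum(m * sum(c * c for c in v) for m, v in zip(masses, velocities))
    return twice_ke / (n_dof * KB)
```

Because this value fluctuates from step to step, reported temperatures are normally averages over many steps.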
31. MIT 10.637 Lecture 5 Berendsen thermostat. Suppresses
fluctuations in kinetic energy, so it does not truly produce the
canonical ensemble. If $\tau$ is the same as the timestep, this is
simply velocity rescaling. A form of velocity rescaling with weak
coupling to an external bath. Velocities are multiplied by a
proportionality factor ($\lambda$) to move the temperature (T)
closer to the set point ($T_0$). Proportionality factor:
$$ \lambda^2 = 1 + \frac{\Delta t}{\tau}\left(\frac{T_0}{T} - 1\right) $$
Revised equations of motion:
$$ m_i \frac{dv_i}{dt} = F_i + \frac{m_i}{2\tau}\left(\frac{T_0}{T} - 1\right) v_i $$
Typically $\tau$ = 0.1-0.4 ps.
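A short sketch of the Berendsen scaling factor, assuming the standard form $\lambda^2 = 1 + (\Delta t/\tau)(T_0/T - 1)$; the function name is made up, and `dt` and `tau` only need to share the same time unit.

```python
def berendsen_lambda(temp, target_temp, dt, tau):
    """Velocity scaling factor for one Berendsen thermostat step."""
    return (1.0 + (dt / tau) * (target_temp / temp - 1.0)) ** 0.5


# Each step, every velocity component would be multiplied by this factor.
lam = berendsen_lambda(250.0, 300.0, 0.002, 0.1)  # slightly above 1
```

Setting `tau` equal to `dt` gives $\lambda^2 = T_0/T$, which is plain velocity rescaling; a larger `tau` nudges the temperature toward the set point more gently.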
32. MIT 10.637 Lecture 5 Andersen thermostat. Correctly samples
NVT. Cannot be used to sample time-dependent properties, e.g.
diffusion or hydrogen bond lifetimes. Each atom at each integration
step is subject to a small, random probability of collision with a
heat bath. This is a stochastic process. Probability of a collision
event during a timestep, for collision frequency $\nu$:
$P = 1 - e^{-\nu \Delta t}$. For small timesteps,
$P \approx \nu \Delta t$, and each particle is assigned a random
number between 0 and 1. If that number is smaller than
$\nu \Delta t$, then the momentum of the particle is reset. The new
momentum is drawn from a Gaussian (Maxwell-Boltzmann) distribution
at the set point temperature.
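The collision step can be sketched as below. This is an illustrative helper, not code from any MD package: it uses the small-timestep approximation in which the collision probability per step is the collision frequency times the timestep, and it draws each velocity component of a colliding particle from a Gaussian with variance $k_B T_0 / m$ (SI units).

```python
import math
import random

KB = 1.380649e-23  # Boltzmann constant in J/K


def andersen_collisions(masses, velocities, target_temp, nu, dt, rng):
    """Apply Andersen collisions in place; return how many particles were hit.

    nu is the collision frequency, so nu*dt is the per-step probability.
    """
    p_collide = nu * dt
    n_hit = 0
    for i, m in enumerate(masses):
        if rng.random() < p_collide:
            sigma = math.sqrt(KB * target_temp / m)  # Maxwell-Boltzmann width
            velocities[i] = [rng.gauss(0.0, sigma) for _ in range(3)]
            n_hit += 1
    return n_hit
```

In a simulation this would be called once per integration step, after the velocity update. The random resets are what destroy time correlations, which is why the thermostat is unsuitable for dynamical properties.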
33. MIT 10.637 Lecture 5 Langevin dynamics. In Langevin
dynamics, all particles experience a random force from particles
outside the simulation as well as a friction force that lowers
velocities. The friction force and random force are related in a
way that guarantees NVT statistics.
$$ m_i \frac{d^2 r_i}{dt^2} = F_i(r) - \gamma m_i \frac{dr_i}{dt} + R_i(t) $$
The terms are the standard force, the friction force with
coefficient $\gamma$, and the random force with random number R(t),
which is related to the friction force through $\gamma$.
Recommended values for $\gamma$ are around 2-5 ps$^{-1}$. Langevin
is susceptible to synchronization artifacts, so it's important to
use a random seed when initializing velocities. In some cases,
Langevin can allow for longer time steps.
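One way to integrate a Langevin equation of this kind is the BAOAB splitting, shown here as a one-dimensional sketch in reduced units. BAOAB is a choice made for this illustration and is not named in the lecture; the function name and argument list are likewise assumptions. The friction and random terms are applied together in the middle step, with the noise amplitude tied to the friction coefficient `gamma` so that the target temperature `kT` is maintained.

```python
import math
import random


def baoab_step(x, v, force, mass, dt, gamma, kT, rng):
    """Advance one Langevin step (B: kick, A: drift, O: friction + noise)."""
    v += 0.5 * dt * force(x) / mass                  # B: half kick
    x += 0.5 * dt * v                                # A: half drift
    c = math.exp(-gamma * dt)                        # O: velocity damping
    v = c * v + math.sqrt((1.0 - c * c) * kT / mass) * rng.gauss(0.0, 1.0)
    x += 0.5 * dt * v                                # A: half drift
    v += 0.5 * dt * force(x) / mass                  # B: half kick
    return x, v
```

For a unit harmonic oscillator with `kT = 1`, a long run of this step should give a mean squared displacement close to 1, as expected for canonical sampling.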
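34. MIT 10.637 Lecture 5 Nosé-Hoover thermostat. Extended system
method: introduce an additional artificial degree of freedom (s)
with an artificial mass (Q):
$$ L = \sum_i \frac{m_i}{2} s^2 \dot{r}_i^2 - V(r) + \frac{Q}{2}\dot{s}^2 - N_f k_B T_0 \ln s $$
Here s stretches the timescale, Q is the artificial mass, $N_f$ is
set by the number of degrees of freedom of the real system, and the
last two terms are the kinetic energy and potential energy of the
heat-bath degree of freedom (s). We sample the microcanonical
ensemble in the extended system variables, but fluctuations of s
result in heat transfer between system and bath, so we sample the
canonical ensemble in the real system.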
35. MIT 10.637 Lecture 5 Thermostat review.
Thermostat                    Description                                        True NVT?  Stochastic?
Velocity rescaling/Berendsen  KE (velocities) revised to produce desired T       No         No
Nosé-Hoover                   Extra degrees of freedom act as thermal reservoir  Yes        No
Langevin                      Noise and friction give correct T                  Yes        Yes
Andersen                      Momenta re-randomized occasionally                 Yes        Yes
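36. MIT 10.637 Lecture 5 Pressure in MD. The Clausius virial
equation is used to obtain the pressure from a molecular dynamics
system:
$$ P = \frac{1}{3V}\left[\sum_i m_i |v_i|^2 + \sum_i r_i \cdot F_i\right] $$
where $r_i$ is the position of particle i, $F_i$ is the force on
it, $m_i$ and $v_i$ are its mass and velocity, and V is the volume.
Barostats may be used to control pressure (e.g. in the NPT
ensemble).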
37. MIT 10.637 Lecture 5 Berendsen barostat. Used in the Amber
code for NPT dynamics. Does not strictly sample from the NPT
ensemble. Positions and volume are rescaled: $r_i \to \mu r_i$,
$V \to \mu^3 V$. Scaling factor:
$$ \mu = \left[1 - \frac{\beta \Delta t}{\tau_P}\left(P_0 - P\right)\right]^{1/3} $$
where $P_0$ is the target pressure and P is the instantaneous
pressure, $\tau_P$ is the pressure coupling time (typically
1-5 ps), and $\beta$ is the isothermal compressibility
($44.6 \times 10^{-6}$ bar$^{-1}$ for water).
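A sketch of the barostat scaling factor, assuming the standard Berendsen form $\mu = [1 - (\beta \Delta t/\tau_P)(P_0 - P)]^{1/3}$. The function name is illustrative; pressures are in bar, times in ps, and the default compressibility is the value quoted for water.

```python
def berendsen_mu(pressure, target_pressure, dt, tau_p, beta=44.6e-6):
    """Coordinate scaling factor for one Berendsen barostat step."""
    return (1.0 - beta * dt / tau_p * (target_pressure - pressure)) ** (1.0 / 3.0)


# Instantaneous pressure 100 bar above the 1 bar target: box expands slightly.
mu = berendsen_mu(101.0, 1.0, 0.002, 1.0)
```

Positions and box lengths would be multiplied by `mu` each step, so the volume changes by `mu**3`. With a 2 fs step and a 1 ps coupling time, even a 100 bar deviation changes the volume by only about nine parts per million per step.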
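38. MIT 10.637 Lecture 5 Properties from MD runs.
Autocorrelation functions: autocorrelation functions (ACFs) can be
defined and calculated for any particle quantity (e.g. $v_i$) or
any system quantity (e.g. U, T, P, $\rho$). The normalized ACF of a
quantity A,
$$ C_A(t) = \frac{\langle A(0) A(t) \rangle}{\langle A(0) A(0) \rangle} $$
starts at 1 and usually decays exponentially with time. Diffusion
coefficient from the velocity ACF:
$$ D = \frac{1}{3}\int_0^\infty \langle v_i(0) \cdot v_i(t) \rangle \, dt $$
[Figure: curves for a solid and a liquid plotted against t (ps).]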
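A normalized autocorrelation function can be estimated from a stored time series by averaging over all available time origins. The sketch below is a direct O(N x lags) implementation with illustrative names; it assumes the quantity has zero mean, as a velocity component does, so no mean is subtracted.

```python
def autocorrelation(series, max_lag):
    """Normalized ACF for lags 0..max_lag, averaged over time origins."""
    n = len(series)
    c0 = sum(x * x for x in series) / n  # <A(0) A(0)>
    acf = []
    for lag in range(max_lag + 1):
        total = sum(series[i] * series[i + lag] for i in range(n - lag))
        acf.append(total / (n - lag) / c0)
    return acf
```

The first element is 1 by construction. For a velocity series, integrating the unnormalized function over time is one route to a diffusion coefficient; long trajectories are needed because the tail of the function is noisy.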
39. MIT 10.637 Lecture 5 Properties from MD runs. Radial
distribution function: [Figure: g(r) plotted against separation r,
with the value 1.0 marked on the g(r) axis and a shell labeled R
and $\Delta R$.]
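The radial distribution function can be estimated from a single configuration by histogramming pair distances and dividing by the count expected for an ideal gas at the same density. The sketch below is an illustration with made-up names; it assumes a cubic periodic box and `r_max` no larger than half the box length.

```python
import math


def radial_distribution(positions, box_length, dr, r_max):
    """Return g(r) as a list over bins of width dr from 0 up to r_max."""
    n = len(positions)
    n_bins = int(r_max / dr)
    counts = [0] * n_bins
    for i in range(n - 1):
        for j in range(i + 1, n):
            d2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                dx = a - b
                dx -= box_length * round(dx / box_length)  # minimum image
                d2 += dx * dx
            k = int(math.sqrt(d2) / dr)
            if k < n_bins:
                counts[k] += 2  # the pair counts for both particles
    density = n / box_length ** 3
    g = []
    for k, c in enumerate(counts):
        shell = 4.0 / 3.0 * math.pi * ((k + 1) ** 3 - k ** 3) * dr ** 3
        g.append(c / (n * density * shell))  # ratio to ideal-gas count
    return g
```

For uncorrelated random positions the result is close to 1 at all separations; structure in a liquid or solid shows up as peaks above 1 at preferred neighbor distances. In practice the histogram is averaged over many frames.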
40. MIT 10.637 Lecture 5 Summary. Well-equilibrated molecular
dynamics gives us access to thermodynamic properties. We need to
choose the right ensemble, thermostat/barostat, simulation cell,
timestep, cutoffs, and force fields for the job. Direct, unbiased
molecular dynamics is limited to sampling the part of the potential
energy surface we've given it enough energy to reach, and by the
timescale accessible with the timestep we've selected. Hydrogens
(flexible or rigid) are the limiting factor in describing molecular
dynamics of organic systems. Adaptive sampling approaches are
required to efficiently sample rare events: higher-energy portions
of the potential energy surface and slower processes.