Molecular Dynamics Simulation:
Understanding DNA melting
Dhananjay Bhattacharyya
Computational Science Division
Saha Institute of Nuclear Physics, Kolkata
E-mail: [email protected]
[Figure: DNA double helix with bases A, G, T and C]
J.D. Watson (Biologist)
and F. Crick (Physicist)
Netropsin like drugs bind in
the B-DNA narrow and deep
minor groove
Actinomycin D like drugs make
their place in between two
stacked base pairs by distorting
the DNA double helix
Open
Semi-Open
Close
HIV-1 Protease
in action
Interactions between Biological
Molecules
• Coulomb interaction between unlike
charges: E = QiQj/Rij (force F = QiQj/Rij²)
• Shape complementarity – van der Waals
interaction
• Hydrogen bond
• Covalent bond (rare cases)
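[Figure: hydrogen-bond geometry with donor D, hydrogen H, acceptor A and acceptor antecedent AB; distances R1 and R2]
$V(r) = \sum_{i<j} \frac{q_i q_j}{r_{ij}} + \sum_{ij \in \text{H-bonded}} d\left[\left(\frac{R_{\min,ij}}{r_{ij}}\right)^{12} - \left(\frac{R_{\min,ij}}{r_{ij}}\right)^{10}\right] f(\theta_{D\text{-}H\text{-}A}, \theta_{H\text{-}A\text{-}AB})$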
Hydrogen Bond
[Figure: energy (kcal/mol) curves for van der Waals; approximate hydrogen bond; van der Waals + Coulomb; approximate H-bond + Coulomb]
Quantum Mechanical Calculations for
Molecules
$\left[-\frac{h^2}{8\pi^2 m}\nabla^2 + V(r)\right]\psi(r) = E\,\psi(r)$
It is analytically solvable for only three or four systems (forms of V)
Molecules have complicated V(r)
Approximate solution of the Schrödinger equation is possible
Approximations are of varied range:
a) Semi-empirical
b) Hartree-Fock
c) Density Functional Theory
d) Møller-Plesset Perturbation Theory, etc.
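Quantum Mechanics: Born-Oppenheimer Approximation
$\left[-\frac{\hbar^2}{2m}\sum_{i=1}^{n}\nabla_i^2 + V(r,R)\right]\psi(r,R) = E\,\psi(r,R)$
$V(r,R) = -\sum_{i=1}^{n}\sum_{K=1}^{N}\frac{Z_K e^2}{r_{iK}} + \sum_{i<j}\frac{e^2}{r_{ij}} + \sum_{J<K}\frac{Z_J Z_K e^2}{R_{JK}}$
$\left(-\frac{\hbar^2}{2m}\nabla_i^2 - \sum_{K}\frac{Z_K e^2}{r_{iK}} + \sum_{j \neq i}\frac{e^2}{r_{ij}}\right)\psi(r_i) = E\,\psi(r_i)$
[Figure: electron e and nucleus of charge Z]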
Ethane (three-fold symmetry): $E = k\{1 + \cos(3\phi)\}$
Ethylene (two-fold symmetry): $E = k\{1 + \cos(2\phi)\}$
$E = k\{1 + \cos(2\phi - 180^\circ)\}$
Electrostatic Interaction between polar atoms
through their atom-centered partial atomic charges
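$E_{coul} = \frac{q_i q_j}{4\pi\varepsilon_o r_{ij}} = K_{coul}\frac{q_i q_j}{r_{ij}}; \quad K_{coul} = 332$
(energy in kcal/mol, distance in Å, charges in units of e)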
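The Coulomb term can be evaluated directly from atom-centered partial charges. The following is a minimal sketch, assuming the conversion constant 332 applies to energies in kcal/mol, distances in Å and charges in units of e; the function name and the example charges are hypothetical.

```python
K_COUL = 332.0  # kcal*Angstrom/(mol*e^2); unit convention assumed


def coulomb_energy(q_i, q_j, r_ij, k_coul=K_COUL):
    """Coulomb energy between two partial charges separated by r_ij (Angstrom)."""
    if r_ij <= 0.0:
        raise ValueError("separation must be positive")
    return k_coul * q_i * q_j / r_ij


if __name__ == "__main__":
    # Hypothetical pair: +0.5 e and -0.5 e, 3.0 Angstrom apart.
    print(coulomb_energy(0.5, -0.5, 3.0))
```

Unlike charges give a negative (attractive) energy, and the magnitude falls off only as 1/r, which is why long-range electrostatics need special treatment in simulations.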
Bonds are also stretchable but at a cost of energy
$E_{bond} = \frac{1}{2}K_{bond}(b - b_o)^2$
[Figure: bond energy vs bond length from QM evaluation, showing the bond-breaking energy]
Bond Angle Deformation
Deformation from equilibrium value costs energy. Simplest form of
energy penalty is:
$E_\theta = k_\theta(\theta - \theta_o)^2$
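Dispersion Interaction / Lennard-Jones
Potential / van der Waals Interaction
$E = \frac{A}{r^{12}} - \frac{B}{r^{6}} = \varepsilon_o\left[\left(\frac{R_o}{r}\right)^{12} - 2\left(\frac{R_o}{r}\right)^{6}\right]$
Minimum energy at: $r = R_o$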
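A short numerical check of the 12-6 potential may help here. This is a sketch only, assuming the form E = ε[(R_o/r)^12 − 2(R_o/r)^6], for which A = ε·R_o^12 and B = 2ε·R_o^6; the parameter values below are hypothetical.

```python
def lennard_jones(r, eps, r_min):
    """12-6 energy written with well depth eps and minimum position r_min."""
    ratio6 = (r_min / r) ** 6
    return eps * (ratio6 * ratio6 - 2.0 * ratio6)


def lj_coefficients(eps, r_min):
    """Equivalent A and B coefficients of the A/r^12 - B/r^6 form."""
    return eps * r_min ** 12, 2.0 * eps * r_min ** 6


def lennard_jones_ab(r, a, b):
    """Same potential in the A/r^12 - B/r^6 form."""
    return a / r ** 12 - b / r ** 6


if __name__ == "__main__":
    eps, r_min = 0.1, 3.6  # hypothetical values (kcal/mol, Angstrom)
    a, b = lj_coefficients(eps, r_min)
    for r in (3.2, 3.6, 4.0, 6.0):
        print(r, lennard_jones(r, eps, r_min), lennard_jones_ab(r, a, b))
```

At r = R_o the energy equals −ε, the steep r^-12 wall dominates at shorter distances, and the attraction decays as r^-6 beyond the minimum.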
Classical Physics-based Force Field
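$E_{pot} = \sum_{bonds}\frac{k_b}{2}(b - b_0)^2 + \sum_{angles}\left[\frac{1}{2}k_\theta(\theta - \theta_0)^2 + \frac{1}{2}F_{ik}(r_{ik} - r_{ik}^0)^2\right] + \sum_{torsions}\frac{V_n}{2}\left[1 + \cos(n\phi)\right] + \sum_{i<j}\varepsilon\left[\left(\frac{r_o}{r}\right)^{12} - 2\left(\frac{r_o}{r}\right)^{6}\right] + \sum_{i<j}\frac{q_i q_j}{r_{ij}}$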
Classical Knowledge-based Force Field
Some Simplified Force Field Parameters for Hydrocarbon and Amides
From Warshel’s book

Bond Stretching Parameters
Bond    ½Kb      b0
CH      286.0    1.099
CC      110.0    1.490
C’N     403.0    1.278
C’O     595.0    1.200
CN      201.0    1.457
NH      405.0    0.980

Angle Bending Parameters
Angle   ½Kθ     θ0       ½Fik    rik0
HCN     40.0    1.911            1.82
CCH     25.0    1.911            2.243
CCC     18.0    1.911            2.555
C’NH    26.0    2.094            2.027
C’NC    54.0    2.094            2.410
HC’O    48.0    2.094            2.390
NC’C    41.0    2.094            2.452

Torsional Parameters
Angle    ½K      n
CCCC     1.2     3
CC’NC    12.0    2
C’NCC’   -1.5    3
NCC’N    0.5     3

Nonbonded Parameters
Atom    (symbol lost)   r*     q       (symbol lost)
H       0.05            3.4    0.1     0.1
C       0.42            3.6    -0.1    1.2
C’      0.40            3.6    0.5     1.2
N       0.36            3.6    -0.4    1.2
O       0.42            3.2    -0.5    1.2
H(N)    0.10            0.4    0.3     0.0
Force-fields:
a) AMBER
b) CHARMM
c) GROMOS
Software:
a) AMBER
b) CHARMM
c) GROMACS
d) NAMD
e) DESMOND …
RESI ALA 0.00
GROUP
ATOM N NH1 -0.47 ! |
ATOM HN H 0.31 ! HN-N
ATOM CA CT1 0.07 ! | HB1
ATOM HA HB 0.09 ! | /
GROUP ! HA-CA--CB-HB2
ATOM CB CT3 -0.27 ! | \
ATOM HB1 HA 0.09 ! | HB3
ATOM HB2 HA 0.09 ! O=C
ATOM HB3 HA 0.09 ! |
GROUP !
ATOM C C 0.51
ATOM O O -0.51
BOND CB CA N HN N CA
BOND C CA C +N CA HA CB HB1 CB HB2 CB HB3
DOUBLE O C
IMPR N -C CA HN C CA +N O
DONOR HN N
ACCEPTOR O C
Protein/position/species
independent residue
TOPOLOGY for Alanine
in
CHARMM format
ALANINE
ALA INT 1
CORR OMIT DU BEG
0.00000
1 DUMM DU M 0 -1 -2 0.000 0.000 0.000 0.00000
2 DUMM DU M 1 0 -1 1.449 0.000 0.000 0.00000
3 DUMM DU M 2 1 0 1.522 111.100 0.000 0.00000
4 N N M 3 2 1 1.335 116.600 180.000 -0.404773
5 H H E 4 3 2 1.010 119.800 0.000 0.294276
6 CA CT M 4 3 2 1.449 121.900 180.000 -0.027733
7 HA H1 E 6 4 3 1.090 109.500 300.000 0.120802
8 CB CT 3 6 4 3 1.525 111.100 60.000 -0.229951
9 HB1 HC E 8 6 4 1.090 109.500 60.000 0.077428
10 HB2 HC E 8 6 4 1.090 109.500 180.000 0.077428
11 HB3 HC E 8 6 4 1.090 109.500 300.000 0.077428
12 C C M 6 4 3 1.522 111.100 180.000 0.570224
13 O O E 12 6 4 1.229 120.500 0.000 -0.555129
TOPOLOGY for Alanine
in
AMBER format
[ ALA ]
[ atoms ]
N N -0.28000 0
H H 0.28000 0
CA CH1 0.00000 1
CB CH3 0.00000 1
C C 0.380 2
O O -0.380 2
[ bonds ]
N H gb_2
N CA gb_20
CA C gb_26
C O gb_4
C +N gb_9
CA CB gb_26
[ angles ]
; ai aj ak gromos type
-C N H ga_31
H N CA ga_17
-C N CA ga_30
N CA C ga_12
CA C +N ga_18
CA C O ga_29
O C +N ga_32
TOPOLOGY for Alanine
in
GROMOS/GROMACS format
E( x, y, z)
E( x+1, y, z)
E( x+2, y, z) …..
Energy Landscape
of typical bio-
molecules
Ener
gy
Positional Variables
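Property Average:
$\langle Q \rangle = \sum_i p(x_i)\,Q(x_i)$ with $p(x_i) \sim e^{-E_i/kT}$
$\langle Q \rangle = \frac{\sum_i Q_i\,e^{-E_i/kT}}{\sum_i e^{-E_i/kT}}$
Available Methods: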
Monte Carlo Simulation
Genetic Algorithm
Molecular Dynamics
Numerical Integration as Area
Covered by a Curve
[Figure: curve inside the rectangle with corners (-1,0), (1,0), (-1,1) and (1,1)]
Random Number
• Probability of finding its value
between x and x+Δx equals that of finding it
between y and y+Δy
• Should not repeat
• Can be generated from a deliberate "mistake" in
calculation (integer overflow), such as
IX = IM * M, where IM is the largest integer a
computer can store
rand = IX/IM
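Both ideas, a uniform random number generator and numerical integration as an area, can be combined in a few lines. The sketch below is an illustration under stated assumptions: it uses the Park-Miller multiplicative congruential generator (multiplier 16807, modulus 2^31 − 1) as a stand-in for the overflow trick, since Python integers do not overflow, and a hypothetical test curve y = 1 − x² inside the box from (-1,0) to (1,1).

```python
def lcg(seed, a=16807, m=2 ** 31 - 1):
    """Multiplicative congruential generator (Park-Miller constants, assumed).

    The wrap-around that a fixed-width integer overflow would provide
    is written explicitly as the modulo operation.
    """
    state = seed
    while True:
        state = (a * state) % m
        yield state / m  # uniform in (0, 1)


def area_under_curve(f, n_samples, seed=1):
    """Hit-or-miss estimate of the area under f for x in [-1, 1], 0 <= f <= 1."""
    rng = lcg(seed)
    hits = 0
    for _ in range(n_samples):
        x = -1.0 + 2.0 * next(rng)  # uniform across the box width
        y = next(rng)               # uniform across the box height
        if y < f(x):
            hits += 1
    box_area = 2.0 * 1.0
    return box_area * hits / n_samples


def parabola(x):
    """Hypothetical test curve; its exact area on [-1, 1] is 4/3."""
    return 1.0 - x * x


if __name__ == "__main__":
    print(area_under_curve(parabola, 100_000))
```

The fraction of random points that fall below the curve, multiplied by the box area, approximates the integral; the statistical error shrinks as one over the square root of the number of samples.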
Probabilistic Method: Monte Carlo Metropolis
$P(x) = \frac{\exp(-U(x)/kT)}{\int \exp(-U(x)/kT)\,dx}$
$\langle Q \rangle = \int Q(x)\,P(x)\,dx$
In a system having two states a and b:
Pa/Pb = exp(-(Ua-Ub)/kT) = probability
of transition from b to a
Probability of transition
= 1 if Ub > Ua
= exp(-ΔU/kT) otherwise, with ΔU = Ua - Ub
[Figure: energy levels of states A and B; the downhill move is always accepted, the uphill move is either accepted or rejected]
Uniformly generated
random numbers are used
to accept the move if
exp(-ΔU/kT) > random no.
and reject it otherwise
Conformation 0: Calculate energy (Ei)
Alter conformation randomly
Calculate energy (Ei+1)
Calculate ρ = exp(-(Ei+1-Ei)/kT)
If ρ > random no.
accept the conformation; otherwise keep the previous one
Repeat the procedure
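The Metropolis procedure above can be sketched for a single coordinate. This is a minimal illustration, not the procedure used in any particular package: the harmonic energy function, the step size and the seed are assumptions chosen so that the expected average ⟨x²⟩ = kT/k is known.

```python
import math
import random


def metropolis(energy, x0, n_steps, kT=1.0, step=1.0, seed=0):
    """Metropolis Monte Carlo sampling of one coordinate x."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = []
    accepted = 0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)  # alter conformation randomly
        e_new = energy(x_new)
        delta = e_new - e
        # Downhill moves are always accepted; uphill moves are accepted
        # when exp(-dE/kT) exceeds a uniform random number.
        if delta <= 0.0 or math.exp(-delta / kT) > rng.random():
            x, e = x_new, e_new
            accepted += 1
        samples.append(x)  # a rejected move counts the old state again
    return samples, accepted / n_steps


def harmonic(x, k=1.0):
    """Hypothetical test energy U(x) = k x^2 / 2."""
    return 0.5 * k * x * x


if __name__ == "__main__":
    xs, rate = metropolis(harmonic, 0.0, 200_000)
    mean_x2 = sum(x * x for x in xs) / len(xs)
    print("acceptance rate:", rate, "<x^2>:", mean_x2)
```

Recording the current state again after a rejection is what makes the plain average over samples converge to the Boltzmann-weighted property average.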
Periodic Boundary Condition
Energy Optimization
$\left.\frac{dE}{dx}\right|_{x} = 0$
Force = ─ Gradient of Potential Energy
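Taylor’s Series
$f(x) = f(x_o) + (x - x_o)\left.\frac{df}{dx}\right|_{x_o} + \frac{(x - x_o)^2}{2!}\left.\frac{d^2 f}{dx^2}\right|_{x_o} + \frac{(x - x_o)^3}{3!}\left.\frac{d^3 f}{dx^3}\right|_{x_o} + \ldots$
[Figure: f(x) vs x, expanded about x_o]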
Multi-variable Optimization: NP-hard
Problem
• Systematic Grid Search procedure: Impossible, large no. variables
• Guided Grid Search: Depends on Choice
• Approximate Method based on Taylor series
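$\left.\frac{dE}{dx}\right|_{x_m} = \left.\frac{dE}{dx}\right|_{x_0} + (x_m - x_0)\left.\frac{d^2E}{dx^2}\right|_{x_0} + \frac{(x_m - x_0)^2}{2!}\left.\frac{d^3E}{dx^3}\right|_{x_0} + \ldots = 0$
$x_m = x_0 - \frac{\left.dE/dx\right|_{x_0}}{\left.d^2E/dx^2\right|_{x_0}}$
$f(x + h) = f(x) + h f'(x) + \frac{h^2}{2!} f''(x) + \ldots$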
Newton-Raphson
Method
Further approximations:
2. Steepest Descent
3. Conjugate Gradient
4. Truncated Newton-Raphson
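The difference between the Newton-Raphson update and a first-derivative method can be seen on a one-dimensional example. The sketch below assumes a hypothetical energy E(x) = (x − 1)² + ½(x − 1)⁴ with its minimum at x = 1, and a fixed step length for steepest descent; real minimizers use line searches and many variables.

```python
def grad(x):
    """dE/dx for the hypothetical energy E = (x-1)^2 + 0.5 (x-1)^4."""
    u = x - 1.0
    return 2.0 * u + 2.0 * u ** 3


def hess(x):
    """d2E/dx2 for the same energy; positive everywhere."""
    u = x - 1.0
    return 2.0 + 6.0 * u ** 2


def newton_raphson(x0, tol=1e-10, max_iter=100):
    """Iterate x <- x - E'(x) / E''(x) until the gradient vanishes."""
    x = x0
    for n in range(1, max_iter + 1):
        x = x - grad(x) / hess(x)
        if abs(grad(x)) < tol:
            return x, n
    raise RuntimeError("Newton-Raphson did not converge")


def steepest_descent(x0, step=0.05, tol=1e-10, max_iter=100_000):
    """Iterate x <- x - step * E'(x) with a fixed step length."""
    x = x0
    for n in range(1, max_iter + 1):
        x = x - step * grad(x)
        if abs(grad(x)) < tol:
            return x, n
    raise RuntimeError("steepest descent did not converge")


if __name__ == "__main__":
    print("Newton-Raphson:", newton_raphson(3.0))
    print("Steepest descent:", steepest_descent(3.0))
```

Newton-Raphson uses curvature and converges in far fewer iterations, but in many dimensions it requires building and inverting the full second-derivative matrix, which is why the cheaper approximations are usually preferred for large molecules.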
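$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}_m = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}_0 - \begin{pmatrix} \frac{\partial^2 E}{\partial x_1^2} & \frac{\partial^2 E}{\partial x_1 \partial x_2} & \frac{\partial^2 E}{\partial x_1 \partial x_3} \\ \frac{\partial^2 E}{\partial x_2 \partial x_1} & \frac{\partial^2 E}{\partial x_2^2} & \frac{\partial^2 E}{\partial x_2 \partial x_3} \\ \frac{\partial^2 E}{\partial x_3 \partial x_1} & \frac{\partial^2 E}{\partial x_3 \partial x_2} & \frac{\partial^2 E}{\partial x_3^2} \end{pmatrix}^{-1} \begin{pmatrix} \frac{\partial E}{\partial x_1} \\ \frac{\partial E}{\partial x_2} \\ \frac{\partial E}{\partial x_3} \end{pmatrix}_0$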
Deterministic Method: Molecular Dynamics
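Verlet Algorithm:
$F = -\nabla E(x) = m a = m\frac{d^2 x}{dt^2}$
$x(t + \Delta t) = x(t) + \frac{dx}{dt}\Delta t + \frac{1}{2!}\frac{d^2x}{dt^2}\Delta t^2 + \frac{1}{3!}\frac{d^3x}{dt^3}\Delta t^3 + \ldots$
$x(t - \Delta t) = x(t) - \frac{dx}{dt}\Delta t + \frac{1}{2!}\frac{d^2x}{dt^2}\Delta t^2 - \frac{1}{3!}\frac{d^3x}{dt^3}\Delta t^3 + \ldots$
$x(t + \Delta t) = 2x(t) - x(t - \Delta t) + \frac{F(x(t))}{m}\Delta t^2$
$v(t) = \frac{x(t + \Delta t) - x(t - \Delta t)}{2\Delta t}$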
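A minimal sketch of the position Verlet recursion for one particle in one dimension follows. The harmonic force, unit mass and time step are assumptions chosen so that the result can be compared with the exact solution x(t) = cos(t); the starting value x(t0 − Δt) is taken from the Taylor expansion.

```python
import math


def verlet(force, mass, x0, v0, dt, n_steps):
    """Position Verlet: x(t+dt) = 2 x(t) - x(t-dt) + F(x(t))/m * dt^2."""
    a0 = force(x0) / mass
    x_prev = x0 - v0 * dt + 0.5 * a0 * dt * dt  # x(t0 - dt) from Taylor series
    x = x0
    traj = [x0]
    for _ in range(n_steps):
        x_next = 2.0 * x - x_prev + force(x) / mass * dt * dt
        x_prev, x = x, x_next
        traj.append(x)
    return traj


def central_velocity(traj, i, dt):
    """v(t) = [x(t+dt) - x(t-dt)] / (2 dt), available one step late."""
    return (traj[i + 1] - traj[i - 1]) / (2.0 * dt)


def spring_force(x):
    """Hypothetical test force F = -k x with k = 1."""
    return -x


if __name__ == "__main__":
    dt, n = 0.001, 6283  # roughly one oscillation period
    traj = verlet(spring_force, 1.0, 1.0, 0.0, dt, n)
    print(traj[-1], math.cos(n * dt))
    print(central_velocity(traj, 1000, dt), -math.sin(1000 * dt))
```

Velocities never enter the position update, which is why they are recovered afterwards from neighbouring positions.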
[Figure: Leap-Frog Verlet Algorithm. Positions x and energies E are evaluated at t0, t0+Δt, t0+2Δt, t0+3Δt, t0+4Δt; velocities v are evaluated at the half steps t0-Δt/2, t0+Δt/2, t0+3Δt/2, t0+5Δt/2, t0+7Δt/2]
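Leapfrog Verlet Algorithm
$v(t + \Delta t/2) = v(t - \Delta t/2) - \frac{\nabla E(t)}{m}\Delta t$
$x(t + \Delta t) = x(t) + v(t + \Delta t/2)\,\Delta t$
$\frac{1}{2}\sum_i m_i v_i^2(t) = \frac{3}{2} N k T$
Temperature and Pressure controlled by:
$v_i^{new} = v_i^{old}\sqrt{1 + \frac{\Delta t}{\tau_T}\left(\frac{T_0}{T} - 1\right)}$
$x^{new} = c_p\,x^{old}, \quad c_p = \left[1 - \frac{\Delta t}{\tau_p}(P_o - P)\right]^{1/3}$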
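The leapfrog update and the velocity-scaling temperature control can be sketched as follows. This is an illustration in reduced units (k = 1 by default); the function names, the coupling time tau and the test values are assumptions, and the scaling factor follows the weak-coupling form sqrt(1 + (Δt/τ)(T0/T − 1)).

```python
import math


def leapfrog_step(x, v_half, force, mass, dt):
    """Advance v(t - dt/2) to v(t + dt/2), then x(t) to x(t + dt)."""
    v_new = v_half + force(x) / mass * dt
    x_new = x + v_new * dt
    return x_new, v_new


def kinetic_temperature(masses, velocities, k_b=1.0):
    """T from (1/2) sum m v^2 = (3/2) N k T; velocities are 3-tuples."""
    kinetic = 0.5 * sum(
        m * (vx * vx + vy * vy + vz * vz)
        for m, (vx, vy, vz) in zip(masses, velocities)
    )
    return 2.0 * kinetic / (3.0 * len(masses) * k_b)


def velocity_scale(t_now, t_target, dt, tau):
    """Factor applied to every velocity to relax T towards t_target."""
    return math.sqrt(1.0 + (dt / tau) * (t_target / t_now - 1.0))


if __name__ == "__main__":
    masses = [1.0, 1.0]
    velocities = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    t_now = kinetic_temperature(masses, velocities)
    lam = velocity_scale(t_now, t_target=1.0, dt=0.1, tau=1.0)
    scaled = [tuple(lam * c for c in v) for v in velocities]
    print(t_now, kinetic_temperature(masses, scaled))
```

Because the temperature depends on the square of the velocities, one scaling step moves T by (Δt/τ)(T0 − T), so the system relaxes towards the target temperature on the time scale τ rather than being forced there instantly.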
Other Integrators:
1. Velocity Verlet
2. Langevin Dynamics
Simple Pendulum
[Figure: average position of a simple pendulum sampled at successive measurements 1 to 5]
Period of measurement of position: ~2.3 T
Recommended period of measurement: ~T/10
Recommended dt for MD: ~$10^{-15}$ s
Heating phase
Equilibration
Solvated system
Temperature variation
Energy variation
Molecular Dynamics Study: System Setup for Simulation
• Simulation in water in presence of counter-ions [Na+] to maintain electroneutrality
• Periodic Boundary condition with Particle Mesh Ewald for long-range electrostatics
• Molecular Dynamics by NAMD in CHARMM27 force-field without any constraint using Constant Pressure-Temperature [CPT] Algorithm
Using no constraint/restraint during the simulation allows us to
study the spontaneous evolution of the structure with time.
Duration of Simulation
• Protein folding requires 1 µs to 1 ms
• Ligand binding/dissociation requires ~1 µs
• No. of steps = 1 ms / Δt = $10^{-3}$ s / $10^{-15}$ s = $10^{12}$
Need of faster computer
Engaging several computers in parallel
Increasing Δt by Shake, Rattle or Lincs algorithms
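The step-count arithmetic can be turned into a small estimate of simulation cost. This is a sketch under stated assumptions: the 2 fs time step with bond constraints and the throughput of 1000 steps per second are illustrative values, not measurements.

```python
def n_steps(duration_s, dt_s):
    """Number of integration steps needed to cover duration_s."""
    return round(duration_s / dt_s)


def wall_clock_days(steps, steps_per_second):
    """Wall-clock time in days at a given integration throughput."""
    return steps / steps_per_second / 86_400


if __name__ == "__main__":
    plain = n_steps(1e-3, 1e-15)        # 1 ms of dynamics at dt = 1 fs
    constrained = n_steps(1e-3, 2e-15)  # assumed dt = 2 fs with constraints
    rate = 1000.0                       # hypothetical steps per second
    print(plain, wall_clock_days(plain, rate))
    print(constrained, wall_clock_days(constrained, rate))
```

Doubling the time step halves the number of steps, but the total remains far beyond a single processor, which motivates the parallel approaches listed above.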
Spatial Decomposition in
Parallel Distributed Computer
CPU 1
CPU 0 CPU 2
CPU 3
Limitations and Further
Improvements
• Atomic movements require rescheduling of tasks
• Network speed limitation
• Simulations using large no. of cores in GPU
• Replica Exchange Molecular Dynamics
• Steered Molecular Dynamics
Typical examples
• Our experiences in Simulation of DNA
double helix
• Understanding Sequence Directed DNA
melting thermodynamics
[Figure: bases A, T, G and C; A-DNA, B-DNA and Z-DNA double helices]
Adapted from: Leontis et al, NAR(2002), 30, 3497
Non-canonical Basepairing
Base Pair Finder: BPFIND
Take a base edge
Identify the H-bonding centers (N3G &
N2G)
Look for H-bond partner through distance
calculation (N6A & N7A)
Calculate pseudo-angles for planarity
Calculate $E = \sum_i (d_i - 3.0)^2 + \frac{1}{2}\sum_k (\theta_k - \pi)^2$, where $d_i$ are
the two H-bond distances and $\theta_k$ are the four
pseudo angles, for comparisons
Gives rise to:
1822 A:U W-W(C);
6056 G:C W-W(C) and
847 G:U W-W(C) base pairs
Das, Mukherjee, Mitra & Bhattacharyya (2006) J Biomol Struct Dynam, 24, 149-161
Structural Motifs of RNA:
G:C W:W C
A:U W:W C
G:U W:W C
A:G H: S T
A:U H:W T
A:A H:H T
G:A W:W C
G:A S:W T
A:A W:W T
A:U W:W T
A:A H: W T
A:U H:W C
G:G S:S T
G:G H:W T
A:C W:W T
C:U W:W T
A:C H:W T
G:G H:WC
G:C W:W T
A:G s:s T
AA HHT
AG SST
AG HST
AU HWT
AU HWT
Double helical regions of RNA consist mainly of G:C, A:U or G:U
base pairs, and others
Containing
WC only
Loop residuesBase
paired
Base triples
Double Helices
Double helical fragments from ribosomal RNA
Sequence: 5’-GCAAACCGG-3’
3’-UGAUGGGCC-5’
Motif: A:A s:hT PDB: 1N32
A:U H:WT
A:G H:ST
Motif: G:A S:HT PDB: 2AW4
A:G H:ST
Sequence: 5’-GUGGGAGCACG-3’
3’-CGUCAGUGUGC-5’
A:A s:hT
A:U H:WT
A:G H:ST
Stability of Non-canonical Motifs: Open-angle
Variation
1N32: A:A s:hT A:U H:WT A:G H:ST
2AW4: G:A S:HT A:G H:ST
CHARMM AMBER
Non-Watson-Crick Basepairs: R:R, Y:Y & R:Y Mismatches
       Basepair    PDB     Sequence
R:R    A:G W:WC    1FJG    5’-CGCCAUGG-3’
                           3’-GCGGGGUC-5’
Y:Y    U:U W:WC    1J5A    5’-GUCUGGC-3’
                           3’-CGGUUCG-5’
R:Y    G:U W:WC    1N33    5’-GGGCUCUACCC-3’
                           3’-CCCGGGAUGGG-5’
CHARMM AMBER
HD-RNAS: Hierarchical Database of RNA Structures
http://www.saha.ac.in/biop/www/HD-RNAS.html
Crystallographic Ensemble from Classification
1J5A: (35) 23S rRNA    5’-GUCUGGC-3’
                       3’-CGGUUCG-5’
2AW4: (52) 23S rRNA    5’-GUGGGAGCACG-3’
                       3’-CGUCAGUGUGC-5’
1FJG: (102) 16S rRNA   5’-CGCCAUGG-3’
                       3’-GCGGGGUC-5’
1N32: (102) 16S rRNA   5’-GCAAACCGG-3’
                       3’-UGAUGGGCC-5’
Summary
• Molecular Dynamics can break a structure
indicating instability of the conformation
• Molecular Dynamics results are compatible with
collective ensemble from crystal structure
database
• RNA double helices with tandem non-canonical
motifs are important for its structure
Thank You
Dr. Debashree Bandyopadhyay
Sukanya Halder
Dr. Shayantani Mukherjee
Sanchita Mukherjee
Dr. Sudipta Samanta
Dr. M.C. Blaise
Dr. Geeta Kant
Dr. Sangeeta Kundu
DBT; CDAC; DAE
MD Simulation
Snapshots of
d(CGATTAATCG)2 at
300K
Mukherjee & Bhattacharyya (2012)
J. Biomol. Struct. Dynam.
Kundu, Mukherjee &
Bhattacharyya (2012) J. Biosci.
1EHV 249D 426D
DNA as base pair stacks
IUPAC-IUB
suggestion
RM
SD
Basepairs near terminal can
easily open up
Fraying effect
Non-cooperative melting
Mukherjee & Bhattacharyya JBSD 2012
Ends of the oligonucleotide
Expected DNA melting curve
indicating cooperative
effect
Observed DNA melting curve for
Base Paired Double Stranded
oligonucleotides
Application: Prediction of Local
Melting in Polymeric DNA
Biological situation Most solution studies
http://www.saha.ac.in/biop/bioinformatics.html
Simulation of
Longer DNA
Stretch with
periodic
boundary
condition
DNA of sequence
d(CGCGCGCGAATTCGCGCGCG)2
Partial
opening after
~10 ns
Complete melting
after 30 ns at 500K
MD Simulations
at 460K
Base pair vs time plot at 460K
[Figure: time in ns (0 to 35) vs base pair step, C1 to C19]
Base pair vs step number at 400K
[Figure: time in ns (0 to 120) vs base pair step, C1 to C19]
Attempted Simulation of
Polymeric DNA
Samanta, Mukherjee, Chakrabarti & Bhattacharyya (2009) J. Chem. Phys. 130, 115103
[Figure: schematic of the simulated A·T duplex (upper case) and the neighbouring segments across the cell boundary (lower case), with residue numbering]
Structural properties in Polymeric
Simulation
Samanta, Mukherjee, Chakrabarti & Bhattacharyya (2009) J. Chem. Phys. 130, 115103
Work in progress: Life time analysis at 460K of CG-poly
[Figure: life time w.r.t. Roll for each base pair step, C1 to C19, comparing oligo and poly]
MD Tutorial
1 1 200
db2000.dat
ALANINE
ALA INT 1
CORR NOMI DU BEG
0.00000
1 DUMM DU M 0 -1 -2 0.000 0.000 0.000
2 DUMM DU M 1 0 -1 1.449 0.000 0.000
3 DUMM DU M 2 1 0 1.522 111.100 0.000
4 N N3 M 3 2 1 1.335 116.600 180.000
5 H1 H E 4 3 2 1.010 130.000 0.000
6 H2 H E 4 3 2 1.010 60.000 90.000
7 H3 H E 4 3 2 1.010 60.000 -90.000
8 CA CT M 4 3 2 1.449 121.900 180.000
9 HA HP E 8 4 3 1.090 109.500 300.000
10 CB CT 3 8 4 3 1.525 111.100 60.000
11 HB1 HC E 10 8 4 1.090 109.500 60.000
12 HB2 HC E 10 8 4 1.090 109.500 180.000
13 HB3 HC E 10 8 4 1.090 109.500 300.000
14 C C M 8 4 3 1.522 111.100 180.000
15 O O E 14 8 4 1.229 120.500 0.000
CHARGE cc-pvtz esp iterated
.1902 .2056 .2056 .2056 .0040
.0430 -.0968 .0470 .0470 .0470
.6837 -.5819
IMPROPER
CA +M C O
DONE
GLYCINE
GLY INT 1
CORR NOMI DU BEG
0.00000
1 DUMM DU M 0 -1 -2 0.000 0.000 0.000
2 DUMM DU M 1 0 -1 1.449 0.000 0.000
AMBER Residue
Topology format
RESI ALA 0.00
GROUP
ATOM N NH1 -0.47 ! |
ATOM HN H 0.31 ! HN-N
ATOM CA CT1 0.07 ! | HB1
ATOM HA HB 0.09 ! | /
GROUP ! HA-CA--CB-HB2
ATOM CB CT3 -0.27 ! | \
ATOM HB1 HA 0.09 ! | HB3
ATOM HB2 HA 0.09 ! O=C
ATOM HB3 HA 0.09 ! |
GROUP !
ATOM C C 0.51
ATOM O O -0.51
BOND CB CA N HN N CA
BOND C CA C +N CA HA CB HB1 CB HB2 CB HB3
DOUBLE O C
IMPR N -C CA HN C CA +N O
CMAP -C N CA C N CA C +N
DONOR HN N
ACCEPTOR O C
IC -C CA *N HN 1.3551 126.4900 180.0000 115.4200 0.9996
IC -C N CA C 1.3551 126.4900 180.0000 114.4400 1.5390
IC N CA C +N 1.4592 114.4400 180.0000 116.8400 1.3558
IC +N CA *C O 1.3558 116.8400 180.0000 122.5200 1.2297
IC CA C +N +CA 1.5390 116.8400 180.0000 126.7700 1.4613
IC N C *CA CB 1.4592 114.4400 123.2300 111.0900 1.5461
IC N C *CA HA 1.4592 114.4400 -120.4500 106.3900 1.0840
IC C CA CB HB1 1.5390 111.0900 177.2500 109.6000 1.1109
IC HB1 CA *CB HB2 1.1109 109.6000 119.1300 111.0500 1.1119
IC HB1 CA *CB HB3 1.1109 109.6000 -119.5800 111.6100 1.1114
RESI ARG 1.00
GROUP
ATOM N NH1 -0.47 ! | HH11
ATOM HN H 0.31 ! HN-N |
ATOM CA CT1 0.07 ! | HB1 HG1 HD1 HE NH1-HH12
ATOM HA HB 0.09 ! | | | | | //(+)
GROUP ! HA-CA--CB--CG--CD--NE--CZ
CHARMM Residue
Topology
[ ALA ]
[ atoms ]
N N -0.28000 0
H H 0.28000 0
CA CH1 0.00000 1
CB CH3 0.00000 1
C C 0.380 2
O O -0.380 2
[ bonds ]
N H gb_2
N CA gb_20
CA C gb_26
C O gb_4
C +N gb_9
CA CB gb_26
[ angles ]
; ai aj ak gromos type
-C N H ga_31
H N CA ga_17
-C N CA ga_30
N CA C ga_12
CA C +N ga_18
CA C O ga_29
O C +N ga_32
N CA CB ga_12
C CA CB ga_12
[ impropers ]
; ai aj ak al gromos type
N -C CA H gi_1
C CA +N O gi_1
CA N C CB gi_2
[ dihedrals ]
; ai aj ak al gromos type
-CA -C N CA gd_4
-C N CA C gd_19
N CA C +N gd_20
[ ARG ]
[ atoms ]
N N -0.28000 0
H H 0.28000 0
CA CH1 0.00000 1
CB CH2 0.00000 1
CG CH2 0.00000 1
GROMACS Residue
Topology