
Steps towards an Ensemble-Based Force Field Fitting Procedure… Dragos Horvath, Benjamin Parent, Guy Lippens UMR 8525 CNRS, Lille



Goal…

• To calibrate an empirical molecular force field for use in conformational sampling and docking:

– Generally applicable to proteins, sugars, organic ligands

    • Full-atom simulations, no large protein folding

– Tailor-made for use with torsional degrees of freedom only!

    • Continuum model for solvent effects!

– Consistent, in the sense that docking affinities & folding propensities should be directly linked to computed force field energies of sampled ensembles

    • No a posteriori rescoring of docking poses! Docking is just simultaneous conformational sampling of several molecules!


The Prerequisite: an Exhaustive Conformational Sampling Tool!

• Based on a Genetic Algorithm, coding conformers as "chromosomes" in which each locus stands for a torsional angle value.

• The In Silico Darwinian Evolution, leading to fitter and fitter (lower energy) conformers, was enhanced by:

– Hybridization with various optimization heuristics

– Fine-tuning of the parameters controlling the evolutionary strategy

[Figure: generation of new offspring. Mutation: a wild-type chromosome (… i, i+1, … n) yields a mutant with one altered locus. Crossover: parent1 and parent2 exchange chromosome segments to give child1 and child2. Population scheme: random initial population, sorted by energies; intermediate population, sorted; next generation.]
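The chromosome coding and the two offspring operators named on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 30° torsion grid, the one-point crossover, the truncation selection and every function name are assumptions made for the example.

```python
import random

ANGLE_STEP = 30  # hypothetical torsion grid in degrees (assumption, not from the slides)


def random_chromosome(n_torsions, rng):
    """One conformer: a list holding one torsion angle value per locus."""
    return [rng.randrange(0, 360, ANGLE_STEP) for _ in range(n_torsions)]


def mutate(chromosome, rng):
    """Return a copy in which one randomly chosen locus takes a new value."""
    mutant = list(chromosome)
    locus = rng.randrange(len(mutant))
    choices = [a for a in range(0, 360, ANGLE_STEP) if a != mutant[locus]]
    mutant[locus] = rng.choice(choices)
    return mutant


def crossover(parent1, parent2, rng):
    """One-point crossover: the two parents swap their tails."""
    cut = rng.randrange(1, len(parent1))
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2


def next_generation(population, energy, rng):
    """Breed offspring, sort parents plus offspring by energy, keep the best."""
    size = len(population)
    offspring = []
    while len(offspring) < size:
        p1, p2 = rng.sample(population, 2)
        c1, c2 = crossover(p1, p2, rng)
        offspring.extend([mutate(c1, rng), c2])
    intermediate = population + offspring
    intermediate.sort(key=energy)  # lower energy means fitter
    return intermediate[:size]
```

Because parents and offspring are sorted together before truncation, the lowest energy in the population can never get worse from one generation to the next. A real sampler would plug a force field energy into `energy`.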


Hybrid Heuristics: (1)-Targeted torsion angle choice!

• Biasing the probabilities to draw a given value for a given angle (according to a temperature parameter):

– Knowledge-based bias: favoring locally stable torsions…

– "Traditionalism": favoring torsion values seen in previously visited samples

[Figure: two panels, "polycycle: torsion nr. 1" and "polycycle: torsion nr. 3", plotting probability (scale 0 to 0.4) against angle (scale 0 to 300). Legend: "biased torsion probabilities thanks to learning" and "biased torsion probabilities wrt local Hamiltonian".]
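A temperature-controlled bias of this kind could look like the sketch below. The slide only states that a temperature parameter controls the bias; the Boltzmann form, the way visit counts enter, and the names `biased_torsion_probabilities`, `tradition_weight` and `draw_torsion` are assumptions for illustration.

```python
import math
import random

GAS_CONSTANT = 0.0019872  # kcal/(mol K)


def biased_torsion_probabilities(local_energies, visit_counts, temperature,
                                 tradition_weight=1.0):
    """Probability of drawing each candidate value of one torsion angle.

    local_energies: local energy (kcal/mol) of each candidate value
    visit_counts:   occurrences of each value in previously visited samples
    temperature:    the higher it is, the flatter the distribution
    """
    kt = GAS_CONSTANT * temperature
    e_min = min(local_energies)  # shift energies so the largest weight is 1
    weights = [
        # Assumed combination: Boltzmann factor times a "tradition" bonus.
        math.exp(-(e - e_min) / kt) * (1.0 + tradition_weight * n)
        for e, n in zip(local_energies, visit_counts)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def draw_torsion(angles, probabilities, rng):
    """Draw one angle value according to the biased probabilities."""
    return rng.choices(angles, weights=probabilities, k=1)[0]
```

At low temperature the draw concentrates on locally stable values; at high temperature it approaches the flat distribution.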


Hybrid Heuristics: (2)-Directed Mutants or Explorers

• Torsional angle driving:

– Evolution stuck in local minima, no mutation would help

– Adding a constraint term

– Gradient optimization in this new landscape

– Final relaxation towards local minimum

• "Explorer" launched in parallel in order not to halt the evolution process

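The three driving steps (constraint term, gradient optimization in the modified landscape, final relaxation) can be illustrated on a toy torsional landscape. Everything below is an assumption made for the sketch: the periodic form of the constraint, the fixed-step gradient descent standing in for a real optimizer, the toy energy and all names.

```python
import math


def toy_energy(torsions):
    """Toy landscape (radians): three wells per torsion, the one at pi is deepest."""
    return sum(1.0 + math.cos(3.0 * t) + 0.5 * math.cos(t) for t in torsions)


def numerical_gradient(f, x, h=1e-5):
    """Central-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def minimize(f, x, step=0.01, iterations=2000):
    """Plain fixed-step gradient descent (stand-in for a real optimizer)."""
    x = list(x)
    for _ in range(iterations):
        grad = numerical_gradient(f, x)
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x


def launch_explorer(energy, start, locus, target_angle, force_constant=10.0):
    """Drive one torsion towards target_angle, then relax without the constraint."""
    def constrained(x):
        # Assumed constraint form: periodic in the driven torsion.
        return energy(x) + force_constant * (1.0 - math.cos(x[locus] - target_angle))

    driven = minimize(constrained, start)  # optimization in the new landscape
    return minimize(energy, driven)        # final relaxation to a local minimum
```

For example, starting from the local minimum reached by `minimize(toy_energy, [1.0, 1.0])`, the call `launch_explorer(toy_energy, start, locus=0, target_angle=math.pi)` carries the first torsion over the barrier into the deeper well while the second torsion stays where it was.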

Other Hybrid Heuristics - Automated Fragment Presampling, Taboo Search

• Taboo Search & Intrapopulation diversity control:

– Discarding chromosomes that are too similar to fitter conformers or to previously visited geometries

• Fragmentation: Sampling of energetically permitted geometries of fragments in presence of a buffer zone

– Allows the automated definition of "rotamer libraries" out of which to pick geometries during global sampling!

[Figure: "Enhancement to Find Native Fragment Geometry in Initial Population"; axis: % Fragments, scale 0.00 to 0.35.]
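The taboo and diversity rule stated above (discard a chromosome if it is too similar to a fitter conformer or to a previously visited geometry) might be sketched as follows. The similarity measure, the threshold handling and the names `torsion_distance` and `prune_population` are assumptions; the slides do not say how similarity is measured.

```python
def torsion_distance(a, b):
    """Largest periodic difference (degrees) between corresponding torsions."""
    largest = 0
    for x, y in zip(a, b):
        diff = abs(x - y) % 360
        largest = max(largest, min(diff, 360 - diff))
    return largest


def prune_population(scored, taboo_list, dissimilarity_limit):
    """Keep a chromosome only if it is dissimilar to all fitter survivors and taboos.

    scored:     list of (energy, chromosome) pairs
    taboo_list: previously visited geometries (chromosomes)
    """
    survivors = []
    for energy, chromosome in sorted(scored, key=lambda pair: pair[0]):
        references = [c for _, c in survivors] + list(taboo_list)
        if all(torsion_distance(chromosome, ref) >= dissimilarity_limit
               for ref in references):
            survivors.append((energy, chromosome))
    return survivors
```

Processing the population in order of increasing energy guarantees that, of two near-duplicates, the fitter one is the survivor.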


Search for Optimal Sampling Setups in the Strategy Parameter Space…

p1 p2 p3 p4 p5 p6 p14 p15

• Population management

– Population size

– Number of parallel processes

– Migration rate between 'islands'

• Evolution management

– Crossover rate

– Mutation rate

– One/two point crossover rate

– Selection pressure

– Dissimilarity limit

– Maximal age

• Convergence management

– Apocalypse (population reset) frequency

– Elitism

– Global stop condition

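Fitness of a parameter setup: computed from the minima found, through the Boltzmann sum $k_b T \,\ln \sum_{i \,\in\, \text{minima found}} \exp\left(-E_i / k_b T\right)$, and from the CPU time consumed.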
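A fitness of this type can be evaluated as in the sketch below. The Boltzmann sum over the minima found comes from the slide; the sign convention, the linear CPU-time penalty, its weight `cpu_weight` and the function names are assumptions.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)


def population_free_energy(minima_energies, temperature=300.0):
    """-kT ln sum_i exp(-E_i / kT) over the distinct minima found."""
    kt = K_B * temperature
    e_min = min(minima_energies)  # factor out the lowest energy for stability
    partition = sum(math.exp(-(e - e_min) / kt) for e in minima_energies)
    return e_min - kt * math.log(partition)


def setup_fitness(minima_energies, cpu_time, cpu_weight=0.0, temperature=300.0):
    """Hypothetical score: lower free energy and less CPU time both score higher."""
    free_energy = population_free_energy(minima_energies, temperature)
    return -free_energy - cpu_weight * cpu_time
```

Every additional low-energy minimum lowers the free energy, so a setup is rewarded both for finding deep wells and for finding many of them.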


Sampling Engine Overview

[Flowchart elements: the meta-algorithm defines a parameter setup; Run 1, Run 2, … Run n (3-fold repeat), with the Explorer; postprocessing; base of diverse conformers sampled at the current setup; µ-Fitness; Global Base of Diverse Conformers, linked to the « Taboos » and the « Tradition »; a "News?" test with yes/no branches; the meta-GA picks the next set of configurations; GAME OVER.]


Conformational sampling with an optimally tuned GA is (reproducibly!) more efficient than a randomly parameterized simulation

[Figure: linear peptide. Free energy of the population (scale 190 to 225) against nr. of the parameter setting, for optimized parameters and random parameters.]


Impact of the hybrid heuristics on the sampling of cyclodextrin…

[Figure: axis labels "Deepest energy well (kcal)" and "Nr. of diverse conformers within +20 kcal from best minimum"; tick values 0 to 35 and 1, 10, 100. Series: Default, No Exploring, No Taboos, Flat distribution, Preference for locally stable torsions.]


Wanted: *structured* compounds with ~100 torsional degrees of freedom!

• Unfortunately, small molecules showing significant structuring in water (due to weak non-covalent interactions) are rare…

– The "Trp cage" peptide 1L2Y (helix & turn, 20 AA)

– The "Trp zipper" peptide 1LE1 (β-sheet, 13 AA)

– Designed minimalist β-sheet peptide 1UAO (10 AA)

– The WW domain of PIN 1 (34 AA, mostly β-sheet)

– Conformationally Restrained Helical peptide (CRH) with a chemically engineered helix inducer group (21 AA)

– Cyclodextrin (with "opened" rings!)

– Protein-ligand complexes to be used as soon as the docking module is developed!


Force Fields: What's wrong with existing ones?

• Heisenberg's Frustration Principle applies:

– (FF Inaccuracy) × (Chance of "Missing Parameters Error") >> ħ

• Most were fitted with respect to few key points of the energy-geometry landscape, around which molecular dynamics simulations were supposed to gravitate…

– … but sampling methods that facilitate barrier crossings may discover deeper artefactual minima elsewhere!

• Ignoring valence angle flexibility requires some additional "fuzziness" of force field terms, to "accommodate" imprecise interatomic distances…


Considered Force Field terms

• Customized CVFF force field, employing:

– a 10 Å cutoff (with a termination function)

– a smoothing procedure of interatomic clash contributions

– a continuum solvent model

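[Figure: 'smoothing' distance $d_{ij}$ and effective interatomic distance $d^{0}_{ij}$.]

[Equations: Coulomb term $E_{coulomb}$, hydrophobic term $E_{hphob}$ and solvation term $E_{Solv}$ (weight $k_{solv}$), written in terms of $Q_i$, $Q_j$, $V_i$, $V_j$ and powers of $d_{ij}$.]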


The Force Field Fitting Procedure…

• Install a NEW FF parameter configuration

• For each training molecule:

– Locally explore neighborhood of experimental geometry

– Run GA-driven Exhaustive Sampler

– Add all sampled conformers to Data Base & calculate RMS deviation from "native" geometry

• Recalculate energies of stored conformers according to current FF setup

• Calculate folding ΔG according to chosen RMS radius

• All ΔG < 0? Flowchart branches: "Yes, for the first time!", "OK!", "Yes, reconfirmed!", "NO!"

FF parameters:

• Distance-dependent dielectric constant

• Weighting factor of the desolvation penalty

• Weighting factor of the hydrophobic contacts

• Weighting factor of repulsive van der Waals

• Attractive & repulsive van der Waals coefficients of the following types: 'co' (carbonyl C), 'o' (ether-type O), 'h' (aliphatic H), 'cp' (aromatic C), 'oc' (carbonyl O)

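$$\Delta G = -\frac{1}{\beta}\,\ln\frac{\sum_{i \,\in\, \text{well folded states}} \exp(-\beta E_i)}{\sum_{j \,\in\, \text{misfolded states}} \exp(-\beta E_j)}$$

States count as well folded or misfolded according to their RMS deviation from native (chosen RMS radius).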
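The folding free energy used as the fitting criterion can be computed from the stored conformers as sketched below. The split into folded and misfolded states by an RMS radius follows the slide; the function names, the default temperature and the log-sum-exp evaluation are assumptions added for the example.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol K)


def log_sum_exp(values):
    """Numerically stable ln(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def folding_free_energy(conformers, rms_radius, temperature=300.0):
    """Folding free energy from (energy, rmsd_from_native) pairs.

    Conformers within rms_radius of the native geometry count as well folded,
    all others as misfolded. A negative result means folding is favored.
    """
    beta = 1.0 / (GAS_CONSTANT * temperature)
    folded = [-beta * e for e, rmsd in conformers if rmsd <= rms_radius]
    misfolded = [-beta * e for e, rmsd in conformers if rmsd > rms_radius]
    if not folded or not misfolded:
        raise ValueError("need at least one folded and one misfolded conformer")
    return -(log_sum_exp(folded) - log_sum_exp(misfolded)) / beta
```

Since energies are recalculated for every new parameter configuration while the stored geometries and their RMS deviations stay fixed, this function can be re-evaluated cheaply after each force field update.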


Status Quo – after eight iterations in force field parameter space…

• Compounds for which correctly folded conformers were sampled, but misfolded conformers of lower energy were also found!

• Molecules for which the correctly folded conformers were never sampled:

– the WW domain of PIN-1 (34 residues)

– the 1LE1 'Tryptophan Zipper' mini-protein (13 residues)

• Compounds for which experimental conformations are being sampled and ranked among the energetically most stable:

Cyclodextrin (open rings)

Conformationally restrained helical peptide

Tryptophan cage (1L2Y)


Conclusions…

• This is a coherent approach to simultaneously evolve a conformational sampling and docking engine, together with its underlying force field

– Both the ability to find the minima and the quality of the energy landscape are paramount in ensuring that the herein defined measures of free energy will be physically relevant…

– Will the resulting molecular force field be more "sampling-friendly" (with funnel-like landscapes)?

• At this point, it is unclear how quickly – if ever – it will converge, but it is well suited for GRID computations (deployment in progress).

• A genetic algorithm reproducibly finding a significant low-energy representative for each populated energy minimum cannot be envisaged without help from other minimum search heuristics…

THANKS