gerry-evans
Boltzmann Fragment Maps From GC Monte Carlo Simulations: Hit Prioritization, Lead Optimization
MetaLeaps, LLC January, 2016
Definition: Boltzmann Maps
Distributions of chemical fragment or water binding configurations on the surface of macromolecules, with populations adhering to Boltzmann energy statistics
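As a minimal sketch of what "populations adhering to Boltzmann energy statistics" means, the snippet below converts a set of configuration energies into normalized Boltzmann populations. The function name, the temperature default, and the example energies are illustrative assumptions, not values from the presentation.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann (gas) constant in kcal/(mol*K)

def boltzmann_populations(energies_kcal, temperature_k=300.0):
    """Return normalized Boltzmann weights for a list of energies (kcal/mol)."""
    beta = 1.0 / (K_B_KCAL * temperature_k)
    e_min = min(energies_kcal)  # shift by the minimum for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical energies for three binding configurations (kcal/mol)
populations = boltzmann_populations([-8.0, -7.0, -5.0])
```

Lower-energy configurations receive exponentially larger populations, which is why a map built from such statistics can highlight the dominant binding configurations.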
Structured water on DNA High affinity fragment cluster on the EPO receptor
Pyrimidine map for beta catenin
Point of view
• Boltzmann maps are useful for a variety of applications in drug discovery and molecular biology research
• Adequate prediction of differences in binding free energy, between different binding sites and different fragment types, is essential for utility
• Maps must be easy to use, or they won’t be
P53-MDM2 hot spots
Differential binding between isoforms
Non-obvious ideas for lead op, water penalty
Boltzmann Maps from Grand Canonical Monte Carlo (GC/MC) Simulations
• What are they?
• Why are they important?
• How are they used?
Thermodynamically-principled modeling of the configuration and energetics of water and chemical fragment binding, with better sampling
Common preconceived notions
• This must be some form of docking
• Calculated binding affinity is not predictive
• Simulations, modeling tools are too complex to use
Molecular interactions are a statistical phenomenon
• Boltzmann statistics required to accurately characterize ligand binding to proteins
• Docking is scanning for binding poses that “stick” – But no statistically-valid distributions of poses
• Simulations (molecular dynamics, Monte Carlo) produce rigorous Boltzmann statistics – But vary in the adequacy of sampling, practicality
GC/MC simulations are not docking! Docking sampling predicts binding poses but cannot, in principle, accurately predict binding affinity*
– Warren et al., J. Med. Chem. 2006, 49, 5912-5931
*But docking is useful for some tasks (e.g. virtual screening)!
Calculating binding affinity is hard, so invert the problem
• Conventional methods involve summing contributions from many configuration samples – Adequate sampling of the binding configurations a key limitation
• Many examples in the literature of successful predictions of binding free energy – But the computations are ponderous, often of limited generality
In GC/MC, chemical potential* is imposed, and the configurations with a given binding affinity are discovered
– See also the paper from the Essex group at U. Southampton on calculating binding affinity directly from GC/MC data**
*Average free energy per molecule
**“Water Sites, Networks, And Free Energies with Grand Canonical Monte Carlo”, G. A. Ross, M. S. Bodnarchuk, and J. W. Essex, JACS 2015 137 (47), 14930-14943
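To illustrate how an imposed chemical potential drives sampling, here is a sketch of the textbook grand canonical Monte Carlo acceptance rules for inserting and deleting a molecule. This is the standard formulation, not necessarily the presenters' implementation; the function names and the parameters `volume` and `lambda3` (the cube of the thermal de Broglie wavelength) are assumptions introduced for this example.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann (gas) constant in kcal/(mol*K)

def _accept(log_ratio):
    """Metropolis acceptance probability min(1, exp(log_ratio))."""
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)

def insertion_acceptance(mu, delta_u, n, volume, lambda3, temperature_k=300.0):
    """Probability of accepting N -> N+1; delta_u = U(N+1) - U(N) in kcal/mol."""
    beta = 1.0 / (K_B_KCAL * temperature_k)
    log_ratio = math.log(volume / (lambda3 * (n + 1))) + beta * (mu - delta_u)
    return _accept(log_ratio)

def deletion_acceptance(mu, delta_u, n, volume, lambda3, temperature_k=300.0):
    """Probability of accepting N -> N-1; delta_u = U(N-1) - U(N) in kcal/mol."""
    if n == 0:
        return 0.0  # nothing to delete
    beta = 1.0 / (K_B_KCAL * temperature_k)
    log_ratio = math.log(lambda3 * n / volume) - beta * (mu + delta_u)
    return _accept(log_ratio)

# Hypothetical numbers: a favorable insertion at a high chemical potential,
# and the same kind of move at a much lower chemical potential.
p_high_mu = insertion_acceptance(mu=-5.0, delta_u=-10.0, n=0, volume=1000.0, lambda3=1.0)
p_low_mu = insertion_acceptance(mu=-20.0, delta_u=-5.0, n=0, volume=1000.0, lambda3=1.0)
p_delete = deletion_acceptance(mu=-20.0, delta_u=5.0, n=1, volume=1000.0, lambda3=1.0)
```

Lowering the chemical potential makes insertions harder and deletions easier, so only molecules with sufficiently favorable interactions remain bound.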
GC/MC efficiently provides information about fragment or water binding, ranked by affinity
• Fragment maps
  – Impose chemical potential (average FE/molecule)
  – Find where fragments will bind on a protein
  – Challenge is to adequately sample all possibilities
• Predictive ranking of fragment binding
  – Required for practical use of fragment maps
  – Lowest chemical potential at which a fragment survives at a site is used for ranking
  – To be predictive, must account for at least configurational entropy and ΔΔGs
• Statistically-significant fragment distributions
  – Finding appropriate geometries for linking
  – Understand flexibility vs. affinity
  – Complements single-pose X-ray co-crystals
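The ranking rule above (lowest chemical potential at which a fragment survives at a site) can be sketched as follows. The data layout, the site names, the occupancy values, and the `min_occupancy` threshold are all hypothetical choices made for this illustration; the presentation does not specify how survival is measured.

```python
def survival_potential(occupancy_by_mu, min_occupancy=0.5):
    """Lowest imposed chemical potential at which the site is still occupied.

    occupancy_by_mu maps chemical potential (kcal/mol) to the fraction of
    sampled configurations in which the fragment occupies the site.
    Returns None if the site is never occupied above the threshold.
    """
    surviving = [mu for mu, occ in occupancy_by_mu.items() if occ >= min_occupancy]
    return min(surviving) if surviving else None

def rank_sites(sites, min_occupancy=0.5):
    """Rank sites from strongest to weakest by survival potential."""
    scored = {name: survival_potential(rec, min_occupancy) for name, rec in sites.items()}
    occupied = {name: mu for name, mu in scored.items() if mu is not None}
    return sorted(occupied, key=occupied.get)

# Hypothetical annealing records: occupancy at each imposed chemical potential
sites = {
    "site_A": {-2.0: 1.0, -6.0: 0.9, -10.0: 0.7, -14.0: 0.1},
    "site_B": {-2.0: 1.0, -6.0: 0.6, -10.0: 0.0, -14.0: 0.0},
    "site_C": {-2.0: 0.2, -6.0: 0.0, -10.0: 0.0, -14.0: 0.0},
}
ranking = rank_sites(sites)
```

A site that stays occupied at a more negative chemical potential is ranked as the stronger binder; sites that never reach the occupancy threshold are dropped from the ranking.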
Lower ΔH, Smaller ΔS
ΔG = ΔH-TΔS
Greater ΔH, Greater ΔS
Same ΔG with different ΔH, ΔS
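A short worked example of the relation ΔG = ΔH - TΔS shows how two binders with different enthalpy and entropy contributions can end up with the same free energy. The numbers and names below are invented for illustration only.

```python
def gibbs_free_energy(delta_h, delta_s, temperature_k=300.0):
    """Return dG = dH - T*dS, with dH in kcal/mol and dS in kcal/(mol*K)."""
    return delta_h - temperature_k * delta_s

# Hypothetical binder 1: strongly favorable enthalpy, unfavorable entropy
rigid_binder = gibbs_free_energy(delta_h=-11.0, delta_s=-0.01)

# Hypothetical binder 2: weaker enthalpy, favorable entropy
flexible_binder = gibbs_free_energy(delta_h=-5.0, delta_s=0.01)
```

Both cases evaluate to about -8 kcal/mol, which is why a ranking that ignores the entropy term can be misleading.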
Protein flexibility is important but that doesn’t mean all degrees of freedom are treated the same
• Protein flexibility can be factored into 4 spatial scales
  – Lobes, loops, side-chains, protonation states
• Fragment binding doesn’t generally impact lobes, loops
  – Simulate multiple X-ray structures or MD consensus structures
• Side-chain rotamer sampling is valuable (not done yet)
  – Only in the binding site, important in only 10-20% of cases
• Use co-crystal structures for chemotype substitution
• Use MD simulations on larger assembled ligands to evaluate binding (QM/MD even better)
  – Especially important for protein-protein/protein-DNA inhibitors
  – Protein-ligand binding of larger ligands is an emergent phenomenon, not simply additive of component binding
Constrained fragment annealing (CFA) with bound waters to evaluate binding of assembled ligands
Anneal fragment subject to bond constraints to other fragments. Both can move, rotate. Ranks contribution of added fragment
Useful for prioritizing chemotype substitutions.
Tightly-bound waters
[Figure: experimental pIC50 vs. predicted free energy. Highlighted: top 5 compounds by IC50; top 5 compounds predicted by analysis]
In a blinded test with big pharma, correctly ranked 87% of predicted binding affinities using CFA
Side-chain conformation changes were an issue for some compounds
Ranking adequate for chemistry decisions
Summary of the SAMPL3 Challenge data hosted by OpenEye summer 2011*
[Figure: calculated affinity (kcal/mol) vs. experimental affinity (kcal/mol), with no correction and with solvation correction; fits of R² = 0.6651 and R² = 0.7518. Panels: FragMap, SFA**, CFA]
• Solvation correction improves affinity prediction
• CFA upgrades pose prediction
*Kulp III, J. L.; et al. J. Comput.-Aided Mol. Des., 2012, 26(5), 583-594.
**SFA = Single Fragment Annealing of rigid fragment combinations
Constrained Fragment Annealing: Improves predictability, range, and accuracy
• P38 compound ranking using CFA
[Figure: -pIC50 vs. calculated relative free energy for P38 compounds; R² = 0.86716]
Water Modeling: Comparing GC Monte Carlo vs. Molecular Dynamics
GC/MC+Annealing MD (Gromacs)
Lowest energy, multi-body configuration Not found in 10ns MD runs
Also, not found without chemical potential annealing in GC/MC
GC/MC with simulated annealing of chemical potential is an efficient and accurate water free energy modeling technique.
Finds multi-body water configurations not found with other water mapping methods, especially important for nucleic acids
We believe we have the highest performance, most automated GC/MC simulation platform
• Multi-variate (T, µ, V, P, σ, ε, ρ, ΔΔGs, ...), adaptive annealing schedules – Tool box to solve difficult sampling problems – Highly efficient, puts computation where it matters
• Learned-bias sampling strategies – Exponential acceleration of convergence – Surface insertion bias, location bias, flip sampling, etc.
• Multi-species GC/MC – Waters, metal ions, multi-point ion models, co-factors, etc.
• More accurate electrostatics – Charge factoring, ion pairing, energy-based cutoffs, optimized partial charges
• High performance ΔΔGs model – Integrated with GC/MC sampling
• Optimized for Intel architectures, clusters – Multi-threading, SSE-optimized energy calculations
• Comprehensive support for a broad range of macromolecule structures, small molecule chemistries, ions, co-factors, etc.
A wide diversity of targets have been modeled: The method, tools are general (~50,000 maps)
• Kinases – P38 (3 variants) – Proprietary kinases (3 proteins) – Ckit – PhoQ Histidine kinase – JAK2/JAK3 – Mapkap-k2 (5 variants) – cAbl (2 variants)
• Proteases and hydrolytic enzymes – Elastase: PPE, HNE serine proteases – Peptide deformylase (2 variants) – Renin – T4 lysozyme – Peptidyl t-RNA hydrolase – Trypsin – HEWL lysozyme
• Nuclear Hormone Receptors – ROR-alpha – RAR-beta – LXR
• Transferases – Accase – Amino transferase – Phenylethanolamine methyltransferase
• Oxygenases/Reductases – Dihydrofolate reductase (5 types) – Cox1/Cox2 – IDO – CpI hydrogenase
• Receptors – EPO receptor – NOGO – GPCR
• Macromolecular Interactions – PCSK9 – BPTI (trypsin proteinase inhibitor) – Fcrn (peptide mimetic) – Protein/DNA complex – FABP4 – P53/MDM2 – β-catenin
• Other classes – Hsp90 – PTP1B – PARP – NS5B RNA polymerase – Arginase – Keap1 – M2 proton pump – Copper pump – RNA polymerase IV
Results confirmed experimentally In silico validation vs. known ligand binding
Infectious diseases
• Malaria - Dihydrofolate reductase (P. falciparum) - Dihydrofolate reductase (P. vivax)
• Bacterial - Rec A - Gyrase - Mur pathway proteins - D-Ala-D-ala ligase - Alanine racemase
• Tuberculosis - Nad+ synthetase - Malate synthase - pantothenate synthetase - isocitrate lyase
• HIV - GP41 - HIV protease - TAR RNA
• Ebola - Niemann-Pick C1 - Tsg101
• Alphavirus - nsP2 protease
Representative GC/MC Boltzmann Map Applications in the literature
• Guarnieri F, Mezei M. "Simulated annealing of chemical potential: A general procedure for locating bound waters. Application to the study of the differential hydration propensities of the major and minor grooves of DNA." J Am Chem Soc. 1996;118(35):8493-4.
• Kulp III JL, Kulp Jr JL, Pompliano DL, Guarnieri F. "Diverse fragment clustering and water exclusion identify protein hot spots." J Am Chem Soc. 2011;133(28):10740-3.
• Kulp III, JL, et al. “A fragment-based approach to the SAMPL3 Challenge”, J Comput Aided Mol Des, 2012; DOI 10.1007/s10822-012-9546-1.
• Clark M, Guarnieri F, Shkurko I, Wiseman J. "Grand canonical Monte Carlo simulation of ligand-protein binding." J Chem Inf Model. 2006;46(1):231-42.
• G. A. Ross, M. S. Bodnarchuk, and J. W. Essex, “Water Sites, Networks, And Free Energies with Grand Canonical Monte Carlo”, JACS 2015 137 (47), 14930-14943.
• M. Vallée et al., “Pregnenolone Can Protect the Brain from Cannabis Intoxication”, Science 343, 94 (2014).
• Marrone, TJ et al., “Solvation studies of DMP323 and A76928 bound to HIV protease: Analysis of water sites using grand canonical Monte Carlo simulations”, Protein Science (1998), 7, 573-579.
• Clark, M et al., “Fragment-Based Computation of Binding Free Energies by Systematic Sampling”, J. Chem. Inf. Model., 2009.
• Berk, P et al., “Molecular Modeling and Functional Confirmation of a Predicted Fatty Acid Binding Site of Mitochondrial Aspartate Aminotransferase”, J. Mol. Biol. (2011) 412, 412–422.
• Moore, W.R., Jr., “Maximizing discovery efficiency with a computationally driven fragment approach.” Curr Opin Drug Discov Devel, 2005. 8(3): p. 355-64.
• Mezei, M., “Grand-canonical ensemble Monte Carlo study of dense liquid Lennard-Jones, soft spheres and water.” Mol. Phys., 1987. 61: p. 565-582.
• Moffet, K et al., “Discovery of a novel class of non-ATP site DFG-out state p38 inhibitors utilizing computationally assisted virtual fragment-based drug design (vFBDD)”, Bioorganic & Med. Chem. Letters 21 (2011) 7155–7165
Prioritize fragment hits from screens to reduce the number of dead-end chemistry paths taken
• Multiple binding sites (super-stoichiometric binding)
  – Determine the highest affinity site
    • Is it functionally appropriate?
  – Docking/probing/X-ray can identify sites, but too many
    • GC/MC does that and also enables ranking by affinity
• Accessibility of functional groups
  – Are functional groups oriented to allow extension?
    • Modified without disrupting the binding (e.g. not buried)?
    • Oriented towards other accessible, high-affinity hot spots or pockets not blocked by tightly-bound waters?
  – Statistically-valid pose distributions from GC/MC provide answers
• Robustness of chemistry progression opportunities
  – Do other fragments bind in proximity to be linked?
  – Search fragment maps to enumerate, rank possibilities
Fragment maps can be productively used in a fragment-based ligand engineering process
• Similar to fragment screening methods¹
  – Abbott², Astex³, Evotec, Carmot, …
  – Experimental methods, performance requirements are similar
  – Achieve leads with high ligand efficiency (IC50/weight), novel, patentable chemistry
• But orthogonal, complementary to them
  – Experimental screening limited to high solubility, weak binders
  – Computed fragment maps are more general, fine-grain, diverse
  – Many fewer lead candidates are synthesized and tested
• High productivity, success rate (3 design chemists)
  – >20,000 fragment simulations per year, >4,000 QM calculations/yr.
  – Several hundred ligand designs evaluated per month
  – Designs translated into drug leads in all fully-funded programs
¹ “Fragment-based lead discovery grows up”, Nature Reviews/Drug Discovery, 2013 Jan, 12, 5-7.
² “A decade of fragment-based drug design: strategic advances”, Nature Reviews/Drug Disc., 2007 Mar, 6, 211-9.
³ “Experiences in fragment-based drug discovery”, Trends in Pharma. Sciences, 2012 May, 33, 224-32.
Designs translate into confirmed hits/leads
Target | Client | Novel Designs | Fragment Hits* (IC50 < 25 µM) | Ligand Hits** (IC50 < 1 µM) | Lead (IC50 < 100 nM) | Status
enzyme | Client 1 | ✓ | ✓ | ✓ | ✓ | In clinic
enzyme | Client 2 | ✓ | ✓ | ✓ | ✓ | Ag. field test
enzyme | Client 3 | ✓ | ✓ | ✓ | ✓ | X-ray confirmed
DHFR | Partner 1 | ✓ | ✓ | ✓ | ✓ | completed
HSP90 | Partner 2 | ✓ | ✓ | | | canceled
PARP | Internal | ✓ | ✓ | | | deprioritized
PTP1B | Internal | ✓ | ✓ | | | deprioritized
Renin | Internal | ✓ | ✓ | ✓ | ✓ | %F data
PCSK9 | Internal | ✓ | ✓ | ✓ | | sold
RecA | Internal | ✓ | ✓ | | | Waiting funding
NS5B | Collab. | ✓ | ✓ | ✓ | | Cell data
PDF | Client | ✓ | ✓ | ✓ | ✓ | patent
P38 | Internal | ✓ | ✓ | ✓ | ✓ | validation
7-20 compounds synthesized/project
17-40 compounds synthesized/project
*Fragment hits: 150-250 Da
**Drug-like, cell activity: 300-400 Da
Delivering on the promise to solve hard problems
• Identify “hot spots”
  – Key interactions for disrupting protein-protein interactions
  – Allosteric modulation sites
  – Binding site sub-pockets
• Non-obvious ideas for optimizations that preserve potency, addressing the “SAR Paradox”
  – Design for a range of physico-chemical properties for membrane penetration, bioavailability, etc.
  – Avoid or strengthen patents
• Exploiting differential fragment binding patterns between different protein structure variations
  – Selectivity between isoforms
  – Binding that is not sensitive to mutations
  – Multi-targeting protein pathways, patient sub-populations
• Difficult targets where screening has not yielded good leads
  – E.g. protein-protein interactions, peptide mimetics
Examples
Renin, Peptide Deformylase, RecA
Goals in renin inhibitor lead optimization
• Improve bioavailability (F%)
• Generate new IP
• Improve physical properties (cLogD)
• Lower mol. wt. (<400 Da) while maintaining affinity
  – i.e. better ligand efficiency
• Make a limited number of compounds (< 35)
Renin project
• Used 2IKO.pdb structure
• Fragment maps used to:
  – Discover novel scaffolds interacting with catalytic aspartates
  – Identify sub-pocket binding site not previously exploited
  – Determine more optimal linkage to heterocycles
  – Improve ligand efficiency (LE)
    • With lower mol. wt. and broader range of cLogD values
• Round 1
  – 15 compounds made and tested
  – IC50 range achieved: 600 nM to 10 µM
• Round 2
  – 17 compounds made and tested
  – IC50 range achieved: 40 nM to 250 nM
Inhibitor in the 2IKO co-crystal
Two chemists, 6 months, outsourced synthesis & testing
Predicted binding pose of the fragments* used in design (orange) compared to co-crystal ligand (green) in 2IKO showing the interaction of the indole NH with GLY.223 and the ether fragment penetrating S3sp
[Figure labels: ether fragment in S3sp site; GLY.223; novel heterocycle; new orientation; 2IKO ligand; new design]
*From Grand Canonical Monte Carlo simulations
Predicted binding pose of the fragments* used in design (orange) compared to co-crystal ligand (light green) in 2G1R showing the two different NH interactions with GLY.223 for the ligand and new design
[Figure labels: different linkage position; ether fragment; GLY.223; 2G1R ligand; new design; novel heterocycle]
*From Grand Canonical Monte Carlo simulations
Correlation plot of computed FE vs. IC50 shows rank order is predictive within experimental error
[Figure: FE Pred vs pIC50 for renin compounds; x-axis: FE Pred (kcal/mol), y-axis: pIC50; R² = 0.77423. Series: BioLeap scaffold 1, Literature, BioLeap scaffold 2. Compounds on the same vertical line are literature values (red) and the retested value (blue).]
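As a rough illustration of how rank-order agreement between computed free energies and measured IC50 values might be checked, the sketch below converts IC50 to pIC50 and computes a Spearman rank correlation in plain Python. The data points are invented for the example and are not the renin compounds; the helper names are hypothetical, and tied values are not handled.

```python
import math

def pic50(ic50_molar):
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    return -math.log10(ic50_molar)

def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

# Hypothetical predicted free energies (kcal/mol) and measured IC50 values (mol/L)
fe_pred = [-40.0, -32.0, -25.0, -18.0, -10.0]
ic50 = [40e-9, 250e-9, 600e-9, 10e-6, 3e-6]

# More negative predicted free energy should pair with higher pIC50,
# so the sign of the free energy is flipped before ranking.
rho = spearman([-fe for fe in fe_pred], [pic50(v) for v in ic50])
```

A rank correlation is a natural check here because the stated goal is correct ordering of compounds rather than exact affinities.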
pKa results demonstrate improved bioavailability at lower mol. wt. compared to literature compounds
Renin: What we have done using the technology
• Identified 2 novel scaffolds, unique chemotypes with novel IP – Even though renin has been broadly studied for a long time
• Identified a previously underutilized interaction site on the protein – Used to drive up affinity while maintaining high ligand efficiency
• Found new head groups with 3X better affinity at the same mol. wt. – Mol. wt. < 400 Da for all compounds tested – good drug potential
• Shown pKa can be modified while maintaining affinity – Tolerate a range of lipophilicity yet maintain a cLogD in the ideal 2-3 range
• Modify properties that are key determinants for clinical candidates – cLogD, mol. wt., polar surface area (PSA)
• Making and testing only 32 compounds in < 6 months – Highly productive prioritization of optimization opportunities
Comparison of various renin lead op efforts
Organization | Mol. Wt. | Ligand Efficiency* | F% | Est. # Chemist-years | Reference
Merck | 610 Da | .23 | 18% | 20 | 1
Merck | 511 Da | .33 | 41% | 20 | 2
Pfizer | 521 Da | .26 | 74% | 30 | 3
BI | 635 Da | .30 | 17% | 30 | 4
Roche | 600 Da | .30 | <10% | 60 | 5
Vitae | 508 Da | .25 | 13% | 8 | 6
BioLeap | 396 Da | .33 | 58% | 2 | 7
References:
1. P. Lacombe et al. / Bioorg. Med. Chem. Lett. 20 (2010) 5822–5826.
2. A. Chen et al. / Bioorg. Med. Chem. Lett. 21 (2011) 3976–3981.
3. R. Saver / Anal. Biochem. 2007 Jan 1; 360(1): 30-40.
4. B. Simoneau et al. / Bioorg. Med. Chem. 7 (1999) 489-508.
5. H. P. Märki et al.: Il Farmaco 56 (2001) 21–27.
6. C. M. Tice et al. / Bioorg. Med. Chem. Lett. 19 (2009) 3541–3545.
7. I. Cloudsdale, et al., Submitted for publication 2015, draft available upon request.
*LE = 1.4(-log(IC50)/N) N = # heavy atoms Higher is better
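The ligand efficiency formula in the footnote can be evaluated directly. The sketch below assumes IC50 is expressed in mol/L; the function name and the example compound (100 nM, 28 heavy atoms) are hypothetical. The factor 1.4 approximates 2.303·RT in kcal/mol near room temperature, so LE is in kcal/mol per heavy atom.

```python
import math

def ligand_efficiency(ic50_molar, heavy_atoms):
    """LE = 1.4 * (-log10(IC50)) / N, with IC50 in mol/L and N heavy atoms."""
    return 1.4 * (-math.log10(ic50_molar)) / heavy_atoms

# Hypothetical compound: IC50 = 100 nM with 28 heavy atoms
le_example = ligand_efficiency(100e-9, 28)
```

For this example pIC50 is 7, so LE is about 0.35, which would sit slightly above the best values in the table.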
Summary: Boltzmann fragment maps improve productivity in lead op
• Identify often non-obvious chemistry ideas for scaffolds, linkages, chemotype substitutions or additions – Find overlooked sub-pockets
• Rank chemistry modifications by binding affinity – Whether motivated from fragment maps or from chemist – Quantitatively assess the impact of tightly-bound waters
• Prioritize the combinatorial number of possibilities for optimization based on obvious and non-obvious ideas – Make and test dramatically fewer compounds
• Enable broader patents – Using fragment maps to enumerate more possibilities
Peptide Deformylase – Predicting and Experimentally Confirming Water-Mediated Small Molecule Binding and Inhibition
Figure 3. a) The PDF binding site consists of the buried Fe (orange) and Glu133 and the exposed Glu42 and Arg97. SACP predicts a water triplet (b) that bridges Glu42 and Arg97. SACP predicts that the 2-hydroxamic indole (c) binds deeply in the pocket, H-bonding to Glu133, and has a higher affinity than the 3-substituted molecule, which is predicted to rotate out (d) of the pocket. SACP predicts that N-methylating the 3-substituted compound (e) is destabilizing, because the methyl group is floating in a vacuum. The N-isopropyl compound fills the pocket (f) created by the waters and thus is predicted to have higher affinity.
SACP = Simulated annealing of chemical potential
Unanticipated PDF results predicted
BioLeap Confidential
[Figure: PDF: Affinity vs Prediction; x-axis: FE Pred (kcal/mol), y-axis: pIC50; linear fit y = 0.3478x - 0.2995, R² = 0.8388]
Non-obvious results easily predicted
Recent discoveries reveal that all antibiotics act through a final common pathway of DNA damage
• RecA is directly involved in activating the SOS response and DNA repair
• RecA mediates the abilities of many bacteria to overcome the DNA-damaging radicals induced by a range of antibiotics
• RecA inhibitors are expected to have broad spectrum effects and no target-based toxicity
• There is substantial experimental evidence that bacteria lose the ability to develop resistance to antibiotics or perform DNA repair operations after UV exposure if RecA is inhibited
RecA homo-oligomerizes onto single-stranded DNA, forming an activated nucleoprotein filament that induces SOS and can effect recombinational DNA repair
• One residue, F217 on the homo-oligomer surface, when mutated to Y, causes a 250x increase in binding affinity
• Determined that the interface can be inhibited by small molecules
Foti JJ, Devadoss B, Winkler JA, Collins JJ, Walker GC. “Oxidation of the guanine nucleotide pool underlies cell death by bactericidal antibiotics.” Science 336, 315–319 (2012).
RecA: DNA Repair Inhibitor, Antibacterial
RecA: ssDNA binding and oligomerization
RecA + ATP
RecA binds to ssDNA to trigger repair and recombination by SOS pathway
CryoEM of dozens of RecAs oligomerized on ssDNA (Micron, 24(3):309–324, 1993)
RecA involved in pathways of bacterial killing and of resistance to antibiotics
RecA knockout recovers up to 5 orders of magnitude efficacy of antibiotics.
Collins JJ et al. Cell 130, 797–810, 2007
F217Y RecA mutation enhances binding by 250x
• Nine residues (A214-R222) are in the homooligomeric interface and five of the nine residues are identical in 64 RecA sequences
• K216, F217, and R222 have been shown to be intolerant to most mutations.1 Least tolerant is F217. Only a mutation to a tyrosine retains full RecA function.
• Interestingly, the F217Y mutation results in a 250-fold increase in the interaction between RecA subunits2 -- why?
1 M.C. Skiba and K.L. Knight. J Biol Chem, 269(5):3823–3828, Feb 1994
2 De Zutter JK, Forget AL, Logan KM, Knight KL (2001) Structure (Camb) 9: 47–55
QM calculations show a greater extent of charge delocalization in Y vs. F; K216 strongly polarizes the phenyl ring.
RecA: Hot-spot clustering identifies 3 biologically relevant sites
Cluster at RecA homo-oligomer interface
Multiple clusters in ATP binding site
Cluster in single stranded DNA binding site
Fragment maps reproduce conserved interactions at RecA-RecA interface
Three amino acids at the protein-protein interface universally conserved.
F217 K216 R222
Validation compound delays bacterial growth in a UV-irradiation-dependent manner
Predicted inhibitor: 2 orders of magnitude delay in growth
Predicted non-binder: no reduction in growth
[Figure: growth with (+) and without (-) inhibitor and UV irradiation (induced DNA damage); time points 0 h, 3 h, 5 h]
• 2 compounds, MW ~250, known pharmaceutical class
• Designed to block RecA oligomerization
• Compound inhibits RecA in vitro with IC50 = 23 μM (SOS gene reporter assay)
• Next set of compounds designed to link PPI site to ATP site