Tim Cheeseright, Assessing the Similarities of Compound collections using molecular fields: Does it...

DESCRIPTION

This presentation, originally given at the 2012 ACS National Meeting in San Diego, investigates alternative methods of defining chemical space using 3D field-based methodologies and describes their advantages and disadvantages.

1

Assessing the similarity of compound collections using molecular fields: Does it add value?

Tim Cheeseright, Mark Mackey, Rob Scoffin, Martin Slater

2

Conclusions

> It works brilliantly

> All synthetic steps gave yields of 100%

> All enrichments were perfect

> All new molecules were sub nM

> All QSARs were totally predictive, q2 = 1.0

> We expect the call from Sweden any day now

3

Conclusions

> Work in progress

> 3D similarity can add value to compound selection

> Full matrix of similarities possibly unnecessary

> Using probes looks like a possible solution

> Not a panacea

4

Agenda & Background

> Fields & similarity

> Generating screening compounds using Fields

> Selecting a 10K “diverse” library for screening from commercial compounds

  > Initial thoughts

  > Problems

  > More initial thoughts

  > A solution but not a complete one

> Conclusions

5

Field Points

Condensed representation of electrostatic, hydrophobic and shape properties (“protein’s view”)

> Molecular Field Extrema (“Field Points”)

[Figure panels: 2D structure; 3D Molecular Electrostatic Potential (MEP); Field Points (legend: positive, negative, shape, hydrophobic)]

6

Improved MM Electrostatics

> Field patterns from XED force field reproduce experimental results

Interaction of Acetone and Any-OH from small molecule crystal structures

[Figure panels: Experimental; Using XEDs; Not using XEDs, each with partial-charge labels on the C, O and H atoms]

XED adds ‘p-orbitals’ to get better representation of atoms

7

Non-Classical Comparisons

8

Molecular Alignment

[Figure: example alignments with scores 0.66, 0.98 and 0.82]

Cheeseright et al., J. Chem. Inf. Model., 2006, 665

9

Using Fields

> Bioisosteric groups

> Virtual Screening

> Pharmacophore hypothesis

> Qualitative SAR interpretation

> 3D QSAR

> Library Design

10

Field based library design success

11

Libraries from Fields

> Small, custom synthesised libraries (~100s to 1000s of compounds)

> Low scaffold diversity

> Highly targeted

> Lots of manual design

12

An Opportunity & a Challenge

> Provide a small, diverse screening library (10K compounds) for a small biotech company

> Diversity in potential biological targets to be hit

> Minimum redundancy in the set

> Maximum chance of success in finding a lead within available budget and screening resources

13

Initial thoughts

> Customised design not an option - commercial compounds only

> Using Fields to select compounds for screening has been done successfully many times

  > Virtual screening

  > Always in a specific biological context

> What about using Fields to choose a ‘diverse’ set?

> Possible problem with numbers

  > A 10,000 compound library is small

  > 9,000,000 commercially available molecules is very large for 3D diversity

14

Initial thoughts

> Compare 3D and 2D similarities for compound collections - are we wasting our time?

> Take a small compound collection

> Full NxN calculation

> 3D method = Fields & Shape

> 2D method = atom pairs

> Compare and Contrast

15

Conformations

> 3D method requires conformations - which one(s) to use?

> What is the similarity of 2 compounds in 3D?

  > Context is important!

  > Highest across all conformations?

  > Average?

  > Lowest?

> For 3D, similarity calculation is Nconfs x Nconfs
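
The three options raised above (highest, average, lowest) can be compared on a single conformer-by-conformer matrix. The following is a minimal sketch only: the matrix values are hypothetical, and the function name `aggregate_similarity` is an illustration, not part of any Cresset tool.

```python
from statistics import mean

def aggregate_similarity(conf_matrix):
    """Collapse an Nconfs x Nconfs similarity matrix into candidate single scores."""
    scores = [s for row in conf_matrix for s in row]
    return {"highest": max(scores), "average": mean(scores), "lowest": min(scores)}

# Hypothetical 3D similarities: rows are conformers of compound A,
# columns are conformers of compound B.
conf_matrix = [
    [0.82, 0.41, 0.55],
    [0.47, 0.38, 0.60],
]

summary = aggregate_similarity(conf_matrix)
print(summary)
```

Which of the three numbers is the right one depends on context, as the slide notes: the highest score asks whether the two compounds can look alike, while the average and lowest scores penalise flexible molecules.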

17

Compound Collection

> BIONET 'Rule of Three' ('Ro3') Fragment Library: “7,907 'Ro3'-compliant fragments”

> Conformation hunt on every fragment: maximum of 5 conformations (!)

> Full N x N similarity matrix, 3D & 2D (60 Million data points)

> ~30 compounds failed conformation hunting
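
The size of the full matrix follows directly from the library size. The short calculation below is a sketch of that arithmetic; the worst-case alignment count assumes every conformer of one fragment is compared with every conformer of the other, which the slides imply (Nconfs x Nconfs) but do not state for this dataset.

```python
n_fragments = 7907   # library size quoted on the slide
max_confs = 5        # conformer cap quoted on the slide

# Every ordered pair, including each fragment against itself.
full_matrix = n_fragments * n_fragments

# The matrix is symmetric, so only the upper triangle is unique.
unique_pairs = n_fragments * (n_fragments - 1) // 2

# Assumption: all conformer pairs are aligned for every unique compound pair.
worst_case_alignments = unique_pairs * max_confs ** 2

print(f"cells in full matrix:  {full_matrix:,}")
print(f"unique compound pairs: {unique_pairs:,}")
print(f"worst-case alignments: {worst_case_alignments:,}")
```

The full matrix has about 62.5 million cells, in line with the roughly 60 million data points quoted above.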

18

Problems

> 400 MB of data

> Tedious to use and examine

> Pilot study just using the first 500 compounds

  > Some chemical families in this area

  > Still a large dataset to deal with (250,000 data points)

> 2D similarities and fragments

  > Small changes cause disproportionately high changes

  > Atom pairs particularly bad

  > Switch to KNIME fingerprints

  > All 2D values lower than ‘normal’
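
One way to see why small structural changes hit fragments harder is to look at the Tanimoto coefficient on small feature sets. This is an illustrative sketch: the feature counts (10 and 50) are hypothetical, and plain integer sets stand in for real fingerprint bits rather than atom pairs or KNIME fingerprints.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two sets of fingerprint features."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def swap_features(n_features, n_changed):
    """Return two equal-sized feature sets that differ in n_changed features."""
    original = set(range(n_features))
    modified = set(range(n_changed, n_features + n_changed))
    return original, modified

# The same two-feature change applied to a small and a large feature set.
fragment_sim = tanimoto(*swap_features(10, 2))   # fragment-sized: 8 shared of 12
druglike_sim = tanimoto(*swap_features(50, 2))   # drug-sized: 48 shared of 52

print(round(fragment_sim, 2), round(druglike_sim, 2))
```

The same edit costs the fragment far more similarity than the larger molecule, which is consistent with 2D values for fragments coming out lower than ‘normal’.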

19

Comparing 2D and 3D metrics

Agreement

22

Example - Similar Scores

[Structures labelled 101 and 104]

2D sim = 0.9

3D field sim = 0.87

23

Example - Higher 3D Sim

2D sim = 0.1 (other methods = 0.3)

3D field sim = 0.82

24

Example - Higher 3D Sim

[Structures labelled 141 and 454]

2D sim = 0.2

3D sim = 0.7

25

Example - Higher 3D Sim

[Structures labelled 437 and 440]

2D sim = 0.3 (other methods = 0.55)

3D field sim = 0.8
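
The example pairs can be sorted into "agreement" and "higher 3D" cases with a simple rule. The sketch below uses the similarity values quoted on the example slides; treating the numbers 101, 104, 141, 454, 437 and 440 as compound labels is an assumption, and the 0.2 tolerance is an arbitrary cut-off chosen for illustration.

```python
# (label, 2D similarity, 3D field similarity) as quoted on the example slides.
examples = [
    ("101 vs 104", 0.9, 0.87),
    ("141 vs 454", 0.2, 0.7),
    ("437 vs 440", 0.3, 0.8),
]

def classify(sim_2d, sim_3d, tolerance=0.2):
    """Label a pair by how its 2D and 3D scores compare (tolerance is assumed)."""
    if abs(sim_3d - sim_2d) <= tolerance:
        return "agreement"
    return "higher 3D" if sim_3d > sim_2d else "higher 2D"

labels = {name: classify(s2, s3) for name, s2, s3 in examples}
for name, label in labels.items():
    print(name, label)
```

Pairs in the "higher 3D" group are the ones where a field-based method might add value over a 2D fingerprint alone.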

26

So…

> Pilot study suggests some added value

> Full matrix painful even if we could calculate it

> What about a reduced matrix?

  > Use ‘Probe’ compounds to tease out molecules that are different in Field space

  > How many probes?

  > Across how many molecules?

> We were running out of time…
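
The reduced-matrix idea can be sketched as a fingerprint of similarities to a fixed set of probes. Everything in this example is illustrative: `toy_similarity` is a stand-in for a real 3D field similarity, molecules and probes are reduced to single hypothetical descriptor values, and the 20,000 x 100 sizes are the ones quoted on the workflow slide that follows.

```python
def toy_similarity(a, b):
    """Stand-in for a 3D field similarity; maps a descriptor gap into (0, 1]."""
    return 1.0 / (1.0 + abs(a - b))

def probe_fingerprint(molecule, probes, similarity=toy_similarity):
    """Describe a molecule by its similarity to each probe (one row of N x P)."""
    return [similarity(molecule, probe) for probe in probes]

def comparisons(n_molecules, n_probes):
    """Similarity calculations needed for a full N x N matrix and an N x P matrix."""
    return n_molecules * n_molecules, n_molecules * n_probes

probes = [0.0, 5.0, 10.0]     # hypothetical probes spread across descriptor space
molecules = [1.0, 1.5, 9.0]   # hypothetical molecules as single descriptors

fingerprints = [probe_fingerprint(m, probes) for m in molecules]
full, reduced = comparisons(20_000, 100)

print(fingerprints)
print(f"full matrix: {full:,} comparisons, probe matrix: {reduced:,} comparisons")
```

Molecules with similar probe fingerprints are treated as close in Field space without ever being compared to each other directly, which is where the saving comes from.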

27

Compound selection by Field Diversity

> Proposed workflow for generation of a field diverse library:

[Flowchart; boxes listed in the order extracted]

9M commercial compounds

Calc. 200 x 200 2D similarity matrix

Pick 20K sub-set

Pick 100 diverse Field probes

Calc. shape diversity by PMI

Pick 200 sub-set

Property filters

Calc. 20K x 100 Field similarity matrix

Pick 12K Field diverse set

3D PCA on Field matrix

1.2M
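
The flowchart lists a 200 x 200 2D similarity matrix and a step that picks 100 diverse Field probes, but it does not say how the picking was done. One common choice for this kind of selection is a greedy MaxMin pick, sketched below on a hypothetical 5 x 5 matrix. This is an assumption about a possible method, not a description of what was actually used.

```python
def maxmin_pick(similarity, n_pick, seed_index=0):
    """Greedy MaxMin: repeatedly add the item least similar to those already picked."""
    n_items = len(similarity)
    picked = [seed_index]
    # For each item, the highest similarity to any item picked so far.
    nearest = list(similarity[seed_index])
    while len(picked) < min(n_pick, n_items):
        candidates = [i for i in range(n_items) if i not in picked]
        best = min(candidates, key=lambda i: nearest[i])
        picked.append(best)
        nearest = [max(nearest[i], similarity[best][i]) for i in range(n_items)]
    return picked

# Hypothetical symmetric similarity matrix: items 0-1 and 2-3 form two tight pairs.
similarity = [
    [1.0, 0.9, 0.2, 0.3, 0.5],
    [0.9, 1.0, 0.3, 0.2, 0.4],
    [0.2, 0.3, 1.0, 0.8, 0.1],
    [0.3, 0.2, 0.8, 1.0, 0.2],
    [0.5, 0.4, 0.1, 0.2, 1.0],
]

probes = maxmin_pick(similarity, n_pick=3)
print(probes)  # [0, 2, 4]
```

The pick takes one member from each tight pair plus the outlier, which is the behaviour wanted from a probe set meant to span the space.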

30

Field Diverse library: Outcome

12K ‘Field Diverse’ library mapped by 3D PCA on the 100 x 20,000 ‘Field Similarity Fingerprint’

Distinct separation of charged species within this space

[Plot labels: Ammoniums; Piperidines; Benzoic and aliphatic acids]

….so what!!

31

Field Diverse library: Outcome

12K ‘Field Diverse’ library mapped by 3D PCA

Distinct separation of molecules by size within this space

[Plot label: Decreasing size]

….so what!!

32

Deeper - Moderate ‘Field Similarity’

Alignment to ‘template1’

33

Deeper - Moderate ‘Field Similarity’

Alignment to ‘template1’

Random selection of mols

35

Deeper - Moderate ‘Field Similarity’

Alignment to ‘template’

36

Is the chemical space sensible?

Small sulphonamides

Large esters

Two example clusters

37

Conclusions

> Work in progress

> Full similarity matrix shows potential of 3D sim to add value

> Full matrix difficult to handle and possibly unnecessary

> Using probes looks like a possible solution

> Not a panacea - still need to play the numbers game

38

Acknowledgements

> Cresset

  > Martin Slater

  > Rob Scoffin

  > Mark Mackey

  > James Melville

> Mission Therapeutics

  > Keith Menear
