Copyright © 2006 Turquoise Consulting. All Rights Reserved

Overview of 3D Searching: A Powerful Tool for Computer-Assisted Molecular Design

R. S. Pearlman (1) and O. F. Güner (2)

(1) College of Pharmacy, University of Texas, Austin, TX

(2) Turquoise Consulting, San Diego, CA


Introductory Remarks

Drug action is a 2-phase process; drug discovery must address both phases:

Transport of drug from site of administration (e.g., GI tract if oral, blood or muscle if injected) to the “biophase” where the receptor is located

– Does the compound have appropriate physical chemical properties?

– Is it drug-like? Is it bioavailable?

Interaction of drug with receptor

– Does the compound have the right size, shape, and substructural features (e.g., pharmacophore) to enable favorable interaction with receptor?

– Does it show sufficient “affinity” for receptor and/or display intrinsic “activity”?


Outline

What is 3D searching?

Why perform 3D searching?

Essential components required

Brief example


What is 3D Searching?

Searching within large databases of 3D chemical structures for those compounds which satisfy both the chemical and geometric requirements specified in the 3D search query

The search typically reflects the chemical and geometric requirements for a ligand to interact favorably with a particular bio-receptor

That is, the search query usually reflects “the pharmacophore”

3D searching review articles:

– Van Drie, J. H. “3D Database Searching in Drug Discovery,” http://www.netsci.org/Science/Cheminform/feature06.html

– Güner, O. F. and Henry, D. R. “Three-dimensional Structure Searching,” in The Encyclopedia of Computational Chemistry; Schleyer, P. v. R.; Allinger, N. L.; Clark, T.; Gasteiger, J.; Kollman, P. A.; Schaefer III, H. F.; Schreiner, P. R. (Eds.); John Wiley & Sons: Chichester, 1998, vol 5, pp 2988-3003.

– Kurogi, Y. and Güner, O. F. “Pharmacophore Modeling and Three-dimensional Database Searching for Drug Design Using Catalyst,” Curr. Med. Chem. 2001, 8, 1035-1055.


Outline

What is 3D searching?

Why perform 3D searching?

Essential components required

Brief example


Why Perform 3D Searching?

Ideas for extending existing leads

Ideas for new leads

Retrieving “active” compounds from commercial databases

“We use 3D searching to break other people’s patents.” (anonymous, AAPS, 1989)

“Cover” planned patents

Examples of pharmacophores in patents:

– WO 98/04913: Biogen patent on VLA-4 inhibitors

– WO 98/46630: Peptide Therapeutics on Hepatitis C NS3 inhibitors

– US 2002/0013372: Pfizer on CYP 2D6 inhibitors

Used to validate pharmacophore models

Used in “reverse” to predict other activities


Three “Levels” of Computer-Assisted Drug Design

No knowledge of receptor structure

– Classical QSAR

– Modern QSAR

– Recursive partitioning

Limited knowledge of receptor structure

– 3D searching/screening based upon structural complementarity (e.g., Catalyst, ISIS/3D, Unity, Phase)

– Ligand-based pharmacophores (e.g., DISCOtech, GASP, HipHop)

– Pharmacophore-based QSAR (e.g., CoMFA, HypoGen)

Complete knowledge of receptor structure

– “Structure-based drug design”

– Docking (e.g., Glide, C2.LigandFit, DOCK, FlexX, GOLD)

– De novo design (e.g., LeapFrog, Ludi)

– Receptor-based pharmacophores (e.g., C2.SBF, Catalyst, Unity)

– Screening based on computational estimates of drug-receptor “interaction energy”


Searching Software – 2D vs 3D

2D substructure (and similarity) searching

– User specifies atoms and how they are connected

– Users can’t discard undesired “hits” based on 3D geometry of those hits (no 3D information allowed)

– Therefore, user must discard undesired hits based on connectivity and, thereby, pre-determines a large fraction of sub-structural information about hits

3D searching

– User specifies atom types and their relative positions in 3D space

– User does not specify how atoms are connected

– User does not pre-determine “chemistry” of hits


2D vs 3D Searching

An unconstrained search for all compounds containing >C=O, –OH, and –Cl would return many spurious hits (“false positives”)

3D search: uses 3D geometry to constrain positions of chemical features

2D search: uses 2D connectivity to constrain positions of chemical features


Outline

What is 3D searching?

Why perform 3D searching?

Essential components required

Brief example


Essential Components Required for 3D Searching

Large number of interesting 3D structures

Database management software for storage and retrieval

Software to perform search based upon 3D criteria

Rational 3D search criteria


Methods for Acquiring 3D Structures

Experimental determination (e.g., X-ray, NMR)

– CSD (CCDC), ca. 100,000 organic compounds

– Most are not pharmacologically relevant

MM or MO geometry optimization

– Requires 3D (not 2D) initial structures

License commercial database of 3D structures

Convert corporate 2D database to 3D structures and hunt for “buried treasure”

– Corporate structures, virtual libraries, combinatorial libraries


Comments on 3D Databases

The value of searching software is limited by the value of the databases being searched

Size matters (not too big, not too small)

Diversity (sometimes low in corporate databases)

“Richness”

– “Relevance” of structures

– Non-structural information

– Availability of compounds

Stereochemistry

– Needs to be dealt with either within the database, search query, or search algorithm


Software for Generating 3D Databases

CONCORD

– Pearlman, R. S. “Rapid Generation of High Quality Approximate 3D Molecular Structures,” Chem. Des. Aut. News, 1987, 2, 1-7.

CORINA

– Hiller, C.; Gasteiger, J. “Ein Automatisierter Molekülbaukasten,” in Software-Entwicklung in der Chemie, vol. 1; Gasteiger, J. Ed.; Springer, 1987, Berlin, pp 53-66.

WIZARD

– Dolata, P. D.; Leach, A. R.; Prout, K., “Wizard: AI in conformational analysis,” J. Comput.-Aided Mol. Des., 1987, 1, 73-85.

AIMB

– Wipke, W. T.; Hahn, M. A., “AIMB: Analogy and intelligence in model building. System description and performance characteristics,” Tetrahedron Comput. Meth., 1988, 1, 141.


CONCORD – General Capabilities

Converts CONnection table to 3D CoORDinates

– 2D: CT contains information about connectivity per se

– 2.5D: contains additional information about stereochemistry

Handles almost all “drug-sized” compounds

Handles input/output in a wide variety of ways

Very fast

Good to excellent structures

Limitations

– No inorganics or metallo-organics

– Single low-energy conformation

– Not intended for macrocycles, polymers, or other highly flexible structures for which 3D structure is determined by extrinsic rather than intrinsic forces


CONCORD Algorithm

“Expert-system” approach with MM cleanup

Uses rule-based “chemical intuition” when applicable (most acyclic substructures)

Uses pseudo-molecular mechanics approach when “intuition” is not applicable (most cyclic substructures)

– A novel strain function is minimized

– Strain is a function defined such that minimization is performed over a single, composite variable

Initial structure is checked for close contacts; dihedrals causing close contacts (or all acyclic dihedrals) are then relaxed by ultra-fast MM optimization

– Carried out in torsion space, using analytical gradients, and with substantial topological speed-ups


Commercially Available 3D Databases

ACD, CMC, MDDR – MDL Information Systems, Inc.

Pomona-90C – Daylight Information Systems, Inc.

CAST-3D, CAS Registry File – Chemical Abstracts Service

TRIAD – UC Berkeley

CHDC, NCI – Tripos Inc.

CAP, Maybridge, NCI, WDI – Accelrys Inc.

NCI – Nat’l Cancer Institute

Plus Corporate databases and chemical supply houses


Essential Components Required for 3D Searching

Large number of interesting 3D structures

Database management software for storage and retrieval

Software to perform search based upon 3D criteria

Rational 3D search criteria


Database Management Software (DBMS)

General database management software

– There are many examples… all chemistry ignorant (data “name” oriented)

Chemical database management software

– Accelrys, CAS, Daylight, MDL, Tripos, Oracle*

– Unique, structure-related storage key

– Searchable by structure, as well as name, etc.

– Searchable by 2D substructure keys

– Searchable by 3D substructure keys

– Integrated queries (including biological, chemical data, etc.)

– Some include 3D shape-based searches

– Some interface with other modeling and analysis software tools


DBMS -- Keys

“short-cuts”

Bit strings (a.k.a. “fingerprints”) – is a particular feature present? Yes or no?

2D Substructural keys

3D object/distance keys

– Object: atom, lone-pair, ring-centroid, projected point, etc.

– Distance (rigid): single distance bins

– Distance (flexible): ranges of bins

3D shape-based keys

3- and 4-point pharmacophore keys


2D Substructural Keys

Which pre-defined 2D substructures are present in this compound?


3D Object/distance Keys

Which inter-object distances are present within this conformation?
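A minimal sketch of how rigid object/distance keys could be computed for one stored conformation. The object types, coordinates, and the 1.0 Å bin width are illustrative assumptions; none of the packages named in these slides is claimed to work exactly this way.

```python
import itertools
import math

def distance_keys(objects, bin_width=1.0):
    """Return the set of (type_a, type_b, bin_index) keys for one conformation.

    objects: list of (object_type, (x, y, z)) tuples, coordinates in angstroms.
    bin_width: width of each distance bin (an assumed value, not from the slides).
    """
    keys = set()
    for (type_a, pos_a), (type_b, pos_b) in itertools.combinations(objects, 2):
        first, second = sorted((type_a, type_b))  # order-independent pair label
        bin_index = int(math.dist(pos_a, pos_b) // bin_width)
        keys.add((first, second, bin_index))
    return keys

# Hypothetical conformation with three objects.
conformation = [
    ("acceptor", (0.0, 0.0, 0.0)),
    ("donor", (3.5, 0.0, 0.0)),
    ("ring_centroid", (0.0, 4.2, 0.0)),
]
keys = distance_keys(conformation)
```

Each key answers a yes/no question ("is there an acceptor and a donor between 3 and 4 Å apart?"), so a query's required keys can be compared against a stored key set without touching coordinates.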


3D Object/distance Keys [Flexible Search with UNITY and ISIS/3D]

Which inter-object distances could be achieved by this compound?

Max/min distance ranges
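A hedged sketch of a flexible screen built on max/min distance ranges. It assumes the minimum and maximum achievable distance for each object pair has already been estimated by some other means; the dictionary layout and the function name `could_match` are hypothetical and not taken from UNITY or ISIS/3D.

```python
def could_match(compound_ranges, query_ranges):
    """Flexible screen: can every query distance range be reached by the compound?

    compound_ranges: {(type_a, type_b): (min_dist, max_dist)} over accessible conformations.
    query_ranges:    {(type_a, type_b): (low, high)} required by the query.
    """
    for pair, (q_low, q_high) in query_ranges.items():
        if pair not in compound_ranges:
            return False  # required object pair is absent
        c_min, c_max = compound_ranges[pair]
        if c_max < q_low or c_min > q_high:
            return False  # the two intervals do not overlap
    return True

# Hypothetical compounds and query (distances in angstroms).
flexible_compound = {("acceptor", "donor"): (2.8, 7.5)}
stiff_compound = {("acceptor", "donor"): (3.4, 3.6)}
query = {("acceptor", "donor"): (5.0, 6.0)}

flexible_passes = could_match(flexible_compound, query)
stiff_passes = could_match(stiff_compound, query)
```

Passing this screen only means the compound cannot be ruled out; whether a single conformation meets all constraints at once still has to be checked in the slower search phase.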


Flexible Search with Catalyst

[Flowchart, in reading order]

crefs from database, crefs from rigid hits = crefs for screen

Key-based screening with loose tolerance query → crefs for fit

Fitting with loose tolerance query → crefs for flexible fit

Fit optimization

Check thresholds? Bad: loop back. Good: break out.

Energy minimization → Flexible Hit

Flexible + Rigid Hits → Hit List

Parallelized!


Essential Components Required for 3D Searching

Large number of interesting 3D structures

Database management software for storage and retrieval

Software to perform search based upon 3D criteria

Rational 3D search criteria


History and Evolution of 3D DBMS Software

1974 MOLPAT – Gund, Wipke, Langridge - Princeton

1982 DOCK – Kuntz et al. – UC San Francisco

1987-88 Fast 3D builders: CONCORD, CORINA, WIZARD, AIMB

1988 Caveat – Bartlett – UC Berkeley

1988 3D Search – Sheridan et al. – Lederle Labs

1989 Aladdin – Van Drie, Martin – Abbott Labs

1989 MACCS-3D – Henry et al., MDL

1990 ChemDBS-3D – Davies et al., Accelrys

1991 UNITY – Hurst et al., Tripos

1992 ISIS/3D - Henry et al., MDL

1992/3 Catalyst – Van Drie, Kahn - Accelrys


MACCS-3D

Screen from MACCS-3D displaying a hit retrieved from MDDR-3D based on a CNS active drugs pharmacophore from: Lloyd, E.J. and Andrews, P.R. J. Med. Chem. 1986, 29, 453.


ISIS/3D

Screen from ISIS/3D displaying a dopamine antagonist pharmacophore proposed by Martin, Y.C. Tetrahedron Comput. Meth. 1990, 3, 15-25; and a hit retrieved from MDDR-3D.


Unity

A UNITY query based on a set of muscarinic M3 receptor antagonists (Marriott, D.P., Dougall, I.G., Meghani, P., Liu, Y., and Flower, D.R. J. Med. Chem. 1999, 42, 3210) developed using DISCOtech and refined via Tripos’ Pharmacophore Model Analysis tools.


Catalyst

Screen from Catalyst displaying a hit retrieved from Maybridge database based on an angiotensin II blockers pharmacophore developed by Peter Sprague. The conformation of the hit with the highest score is shown overlaid with the original query.


Phase


Query Development

2-fold objective

– Find leads to active compounds

– Don’t find leads to inactive compounds

Iterative process

– Initial query

– Validate

– Modify or improve

Build small development database (training set)

– Include actives and “relevant” inactives

– Avoids ambiguities caused by hits of unknown activity


Searching Software -- Queries

Non-structural criteria

Chemical criteria: atom types or 2D substructures (fragments)

Geometric criteria: 3D constraints between objects

– Representation of pharmacophore

Shape criteria

– Inclusion volumes, exclusion volumes


Searching Software -- Queries

Atom types

– Element

– Charge, hybridization, connectivity, ring, etc.

Objects

– Atoms, pharmacophore features (e.g., hydrophobe)

– Points, vectors, planes, spheres, etc.

Constraints

– Distances

– Angles, dihedrals

– RMS deviations

– Inclusion, exclusion volumes


Importance of “Forbidden Regions”

Example: rigid search, same atomic constraints

– No forbidden regions: 2,016 hits

– 2 “sides of box”: 1,377 hits

– 4 “sides of box”: 383 hits

– 5 “sides of box”: 46 hits

Forbidden regions are even more important for flexible searching

Explore and report inactivity
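A minimal sketch of a forbidden-region check, assuming each “side of box” can be modeled as a plane beyond which no atom of a hit may lie. The plane representation, the coordinates, and the function name `within_allowed_region` are illustrative assumptions, not the implementation behind the hit counts above.

```python
def within_allowed_region(atoms, walls):
    """Return True if no atom lies beyond any wall.

    atoms: list of (x, y, z) coordinates.
    walls: list of (normal, offset); an atom is forbidden when dot(normal, atom) > offset.
    """
    return all(
        sum(n * c for n, c in zip(normal, atom)) <= offset
        for atom in atoms
        for normal, offset in walls
    )

# Hypothetical candidate hit, already aligned to the query frame.
hit_atoms = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (4.5, 0.0, 0.0)]
x_wall = ((1.0, 0.0, 0.0), 4.0)  # forbid x > 4.0
y_wall = ((0.0, 1.0, 0.0), 3.0)  # forbid y > 3.0

kept_without_walls = within_allowed_region(hit_atoms, [])
kept_with_y_wall = within_allowed_region(hit_atoms, [y_wall])
kept_with_both = within_allowed_region(hit_atoms, [x_wall, y_wall])
```

Each added wall can only remove candidates, which is consistent with the shrinking hit counts as more sides of the box are imposed.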


Searching Software – Search Phases

Key-based screening (rapid screen-out phase)

– The objective is to quickly eliminate all those compounds that cannot possibly satisfy the query

– Use keyed structural information

– Consider one constraint at a time

Atom-by-atom mapping (slower geometric search)

– The objective is to actually verify that the compound satisfies the query

– Consider all constraints simultaneously

– Conformational flexibility issue needs to be addressed

– Stereochemistry
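The difference between the two phases can be sketched as follows. This is an illustrative brute-force version with hypothetical object types “A”, “B”, “C” and made-up coordinates; real systems use precomputed keys and far more efficient mapping algorithms.

```python
import itertools
import math

def passes_screen(objects, query_types, constraints):
    """Screen-out phase: test each distance constraint on its own."""
    for (i, j), (low, high) in constraints.items():
        found = any(
            a != b
            and type_a == query_types[i]
            and type_b == query_types[j]
            and low <= math.dist(pos_a, pos_b) <= high
            for a, (type_a, pos_a) in enumerate(objects)
            for b, (type_b, pos_b) in enumerate(objects)
        )
        if not found:
            return False
    return True

def map_query(objects, query_types, constraints):
    """Mapping phase: find one assignment satisfying all constraints at once."""
    candidates = [
        [k for k, (obj_type, _) in enumerate(objects) if obj_type == wanted]
        for wanted in query_types
    ]
    for combo in itertools.product(*candidates):
        if len(set(combo)) < len(combo):
            continue  # one object cannot play two query roles
        if all(
            low <= math.dist(objects[combo[i]][1], objects[combo[j]][1]) <= high
            for (i, j), (low, high) in constraints.items()
        ):
            return combo
    return None

# Query: one A that is 2.5-3.5 angstroms from a B and also from a C.
query_types = ["A", "B", "C"]
constraints = {(0, 1): (2.5, 3.5), (0, 2): (2.5, 3.5)}
true_hit = [("A", (0, 0, 0)), ("B", (3, 0, 0)), ("C", (0, 3, 0))]
decoy = [("A", (0, 0, 0)), ("B", (3, 0, 0)), ("A", (10, 0, 0)), ("C", (13, 0, 0))]
```

The decoy passes the screen because each constraint is met by some pair of objects, but the mapping fails because no single A satisfies both constraints. That is why the slower verification phase is still needed after screening.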


3D Searching Process

[Flow diagram: query input → database → key-based screening (2D, 3D, 4D, 1D) → subset → atom-by-atom mapping → hits → hit list output → databases and spreadsheets]


Conformational Flexibility

The issue:

– If the query reflects a putative bound conformation that is different from the low-energy conformation, then a search over a database of low-energy conformations might miss some interesting hits.

Approaches to the issue:

– Handle the conformational flexibility within the database, by storing multiple conformations of each compound

– Handle the conformational flexibility within the search query, via flexible queries

– Handle the conformational flexibility within the search process, via on-the-fly conformational exploration


Addressing Conformational Flexibility – in the Database

Store multiple conformations of each compound

– Too many to store (3^3 = 27, 12^4 = 20,736)

– Too many to search

– Still no guarantee that the bound conformation is among those stored

Example reference:

– Murrall, N. W.; Davies, E. K. “Conformational Freedom in 3-D Databases,” J. Chem. Inf. Comput. Sci., 1990, 30, 312-316.
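The storage counts on this slide (27 and 20,736) are consistent with a systematic enumeration in which each rotatable bond is sampled at a fixed number of torsional states. That reading is an assumption, and the function name below is hypothetical; the sketch only reproduces the arithmetic.

```python
def conformer_count(states_per_bond, rotatable_bonds):
    """Number of conformations in an exhaustive torsional grid.

    Assumes every rotatable bond is sampled independently at the same
    number of states, so the count grows as states ** bonds.
    """
    return states_per_bond ** rotatable_bonds

coarse = conformer_count(3, 3)   # 3 states on each of 3 bonds
fine = conformer_count(12, 4)    # 12 states (30-degree steps) on each of 4 bonds
```

Under this assumption one extra rotatable bond multiplies the stored conformations per compound by the number of states per bond.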


Addressing Conformational Flexibility – in Search Query

Build conformational flexibility into the query

– Specify ranges for geometric constraints

• Increasing the range increases false positives

• Decreasing the range increases false negatives

– Specify “hinge” points

– Differentiate parts of the query dealing with flexible regions

Generality of the query may be compromised
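The trade-off between false positives and false negatives can be illustrated with a single distance constraint. The compound distances and activity labels below are invented toy values used only to show the effect of widening the range; they are not data from the cited studies.

```python
def screening_errors(compounds, target, tolerance):
    """Count false positives and false negatives for one distance constraint.

    compounds: list of (distance, is_active) pairs.
    A compound is retrieved when |distance - target| <= tolerance.
    """
    false_pos = sum(
        1 for d, active in compounds
        if abs(d - target) <= tolerance and not active
    )
    false_neg = sum(
        1 for d, active in compounds
        if abs(d - target) > tolerance and active
    )
    return false_pos, false_neg

# Hypothetical training set: (measured distance in angstroms, known activity).
toy_set = [
    (4.9, True), (5.3, True), (5.8, True),
    (4.4, False), (6.1, False), (5.1, False),
]
narrow = screening_errors(toy_set, target=5.0, tolerance=0.2)
wide = screening_errors(toy_set, target=5.0, tolerance=1.0)
```

In this toy set the narrow range misses two actives, while the wide range recovers all actives at the cost of one more inactive hit.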

Example references:

– Güner, O. F.; Henry, D. R.; Pearlman, R. S. “Use of Flexible Queries for Searching Conformationally Flexible Molecules in Databases of Three-Dimensional Structures,” J. Chem. Inf. Comput. Sci. 1992, 32, 101-109.

– Güner, O. F.; Henry, D. R.; Moock, T. E.; Pearlman, R. S. “Flexible Queries in 3D Searching. 2. Techniques in 3D Query Formulation,” Tetrahedron Comp. Meth. 1990, 3(6C), 557-563.


Addressing Conformational Flexibility – during Search

Explore conformational flexibility at search time

– Rigid: does this conformation match the query?

– Flexible: could this compound match the query?

– 2-phase process:

• First, rapid screen based upon max/min distance keys

• Then, slower conformational search

– Ensures that no hits are missed (however, it is important to note the local-minima problem)

Example references:

– Moock, T. E.; Henry, D. R.; Ozkabak, A. G.; Alamgir, M. “Conformational Searching in ISIS/3D Databases,” J. Chem. Inf. Comput. Sci., 1994, 34, 184-189.

– Hurst, T. “Flexible 3D Searching: the Directed Tweak Technique,” J. Chem. Inf. Comput. Sci., 1994, 34, 190-196.


Conformational Coverage in 3D Databases

A. Smellie, S.L. Teig, and P. Towbin, "Poling: Promoting Conformational Coverage", J. Comp. Chem., 1995, 16, 171-187.

A. Smellie, S.D. Kahn, and S. Teig, "An Analysis of Conformational Coverage 1. Validation and Estimation of Coverage", J. Chem. Inf. Comput. Sci., 1995, 35, 285-294.

A. Smellie, S.D. Kahn, and S. Teig, "An Analysis of Conformational Coverage 2. Applications of Conformational Models" , J. Chem. Inf. Comput. Sci., 1995, 35, 295-304.

– The most commonly used approach is to store multiple diverse conformations in the database and perform a flexible search

• ISIS/3D – performs a random kick before giving up on a conformation

• UNITY – performs user defined number of kicks

• Catalyst – uses “poling” algorithm to store multiple conformations (av. >30 conf.s)


Essential Components Required for 3D Searching

Large number of interesting 3D structures

Database management software for storage and retrieval

Software to perform search based upon 3D criteria

Rational 3D search criteria


Rational 3D Search Criteria: Pharmacophore Perception

Relatively easy if receptor structure is known

– Otherwise, based on analysis of actives and inactives

Relatively easy if compounds are rigid

– Otherwise difficult and expensive

– E.g., active analog approach: constrained systematic search, considering the intersection of the accessible conformation space of all compounds in the training set

The concept of pharmacophores and detailed examples of pharmacophore development will be covered in the next lecture


Outline

What is 3D searching?

Why perform 3D searching?

Essential components required

Brief example


ACE Inhibitors


ACE Inhibitors Pharmacophore Model

Object-1: Zn-ligand (sulfhydryl or carboxylate oxygen)

Object-2: H-bond acceptor (N, O, or F)

Object-3: anion (--CS-, --COO-, --SO4-2, or --PO4-3)

Object-4: indicates direction of lone-pair on object-2

Object-5: “central” atom in anion labeled object-3

Mayer, D.; Naylor, C. B.; Motoc, I.; Marshall, G. R. “A unique geometry of the active site of angiotensin-converting enzyme consistent with structure-activity studies,” J. Comput.-Aided Mol. Des. 1987, 1(1), 3-16.


Note Regarding Geometric Search Criteria

Conformation space of potential ligands is, generally, multi-dimensional space of high volume

Combination of apparently broad criteria on each of the several axes results in greatly reduced volume of conformation space to be explored

Rubik’s Cube example
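The slide does not spell out the cube example. One plausible reading, offered here as an assumption, is that accepting one third of the range along each of three axes leaves a single cubelet out of 27. The sketch below computes the remaining fraction of conformation space when independent criteria are combined; the function name is hypothetical.

```python
from fractions import Fraction
from math import prod

def remaining_fraction(accepted_per_axis):
    """Fraction of the space left when each axis keeps only part of its range.

    accepted_per_axis: iterable of Fractions, one per independent axis.
    Assumes the criteria act on independent axes, so the fractions multiply.
    """
    return prod(accepted_per_axis, start=Fraction(1))

cube = remaining_fraction([Fraction(1, 3)] * 3)       # one third on each of 3 axes
six_axes = remaining_fraction([Fraction(1, 2)] * 6)   # one half on each of 6 axes
```

Even criteria that each accept half of their own range leave only 1/64 of the space once six of them are combined.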


Search for ACE Inhibitors

Search performed at Lederle Laboratories by Sheridan, R. P.; Nilakantan, R.; Rusinko, A. III; Bauman, N.; Haraki, K. S.; Venkataraghavan, R., “3DSEARCH: A system for three-dimensional substructure searching,” J. Chem. Inf. Comput. Sci., 1989, 29, 255-260.

Found 96 “hits” in their corporate database of 223,988 structures

Required ca. 7 VAX-8650 CPU minutes

[would require ca. 2 SGI R10k seconds]


Summary

3D searching works but requires a team effort:

– Laboratory synthesis and testing (and/or HTS)

– Molecular modeling for query refinement

– Tight interface between modeling and searching software

– Hit list analysis, prioritization, post-processing

3D Searching can spark chemists’ imagination

The more information provided by chemists, the more information returned by 3D search