Chemical Similarity

DESCRIPTION

AACIMP 2009 Summer School lecture by Willie Peijnenburg. "Environmental Chemoinformatics" course.

Text of Chemical Similarity

  • 1. Chemical Similarity. Willie Peijnenburg, RIVM, Laboratory for Ecological Risk Assessment.

  • 2. Similarity: the philosopher's view. Exploiting the similarity concept is a sign of immature science (Quine). It is ill-defined to say "A is similar to B"; it is only meaningful to say "A is similar to B with respect to C". A chemical A cannot be similar to a chemical B in absolute terms, but only with respect to some measurable key feature.

  • 3. Similarity: the chemist's view. Intuitive, based on expert judgment. A chemist would describe similar compounds in terms of an approximately similar backbone and almost the same functional groups. Chemists have different views on similarity (experience, context). Lajiness et al. (2004). Assessment of the Consistency of Medicinal Chemists in Reviewing Sets of Compounds, J. Med. Chem., 47(20), 4891-4896.

  • 4. Chemical similarity. Computerized similarity assessment needs unambiguous definitions. "Structurally similar molecules have similar biological activities" is the basic tenet of chemical similarity: there is long supporting experience, but also many exceptions. Exceptions are important! Identification of the most informative representation of molecular structures: avoiding information loss is important! Similarity measures.

  • 5. Chemical similarity quantified. Numerical representation of chemical structure: structural similarity; descriptor-based similarity; 3D similarity; field-based; spectral; quantum mechanics; more. Comparison between numerical representations: distance-like; association, correlation.
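
The two families of comparison named on slide 5 can be illustrated with a minimal sketch. It assumes a fingerprint is stored as the set of its "on" bit positions and a descriptor set as a plain list of numbers; the function names and the example values are hypothetical, not taken from the lecture. The association measure used is the Tanimoto coefficient, N_AB / (N_A + N_B - N_AB), where N_A and N_B are the numbers of bits set in each fingerprint and N_AB the number set in both; the distance-like measure is the Euclidean distance.

```python
import math


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient N_AB / (N_A + N_B - N_AB) for two sets of 'on' bits."""
    a, b = set(bits_a), set(bits_b)
    common = len(a & b)                  # N_AB: bits set in both fingerprints
    union = len(a) + len(b) - common     # N_A + N_B - N_AB
    # Convention chosen for this sketch: two empty fingerprints count as identical.
    return common / union if union else 1.0


def euclidean(desc_a, desc_b):
    """Euclidean distance between two descriptor vectors of equal length."""
    if len(desc_a) != len(desc_b):
        raise ValueError("descriptor vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(desc_a, desc_b)))


# Hypothetical fingerprints, given as indices of the bits that are set.
fp_a = {1, 4, 7, 9}
fp_b = {1, 4, 8, 9, 12}
print(tanimoto(fp_a, fp_b))        # 3 / (4 + 5 - 3) = 0.5

# Hypothetical two-component descriptor vectors.
print(euclidean([0.0, 3.0], [4.0, 0.0]))  # 5.0
```

In practice descriptor vectors are usually scaled before a Euclidean distance is taken, since otherwise the descriptor with the largest numerical range dominates; that step is left out here.
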

  • 6. Structural similarity. Substructure searching; maximum common substructure. Fragment approach: atom, bond or ring counts, degree of connectivity; atom-centred, bond-centred, ring-centred fragments; fingerprints, molecular holograms, atom environments. Topological descriptors: Hosoya Z, Wiener number, Randić index, indices on distance matrices of the graph (Bonchev & Trinajstić), bonding connectivity indices (Basak), Balaban J indices, etc. These were initially designed to account for branching, linearity, presence of cycles and other topological features. There are attempts to include 3D information (e.g. distance matrices instead of adjacency matrices).

  • 7. Structural similarity. Isosteric replacements of groups. Substituents: F, Cl, Br, I, CF3, NO2; methyl, ethyl, isopropyl, cyclopropyl, t-butyl; -OH, -SH, -NH2, -OMe, -N(Me)2. Atoms and groups in rings: -CH=, -N=; -CH2-, -NH-, -O-, -S-. More. So: a single group makes the difference, but it depends on the endpoint (e.g. lipophilicity, receptor binding)! Example pair: 3-(2-chloro-4-(trifluoromethyl)phenoxy)phenyl acetate, CAS# 50594-77-9, and 5-(2-chloro-4-(trifluoromethyl)phenoxy)-2-nitrophenyl acetate, CAS# 50594-44-0. Data listed on the slide: oral LD50 for male rats = 2.5 g/kg; dermal LD50 for male rats = 3.54 g/kg; not irritating to eyes of rabbits; slightly irritating to skin of rabbits; not mutagenic in Salmonella strains. The phenyl acetate has a higher potential binding affinity to the estrogen receptor than the nitrophenyl acetate; the nitrophenyl acetate has a higher potential to cause cancer than the phenyl acetate. Walker, J. (2003), QSARs for Pollution Prevention, Toxicity Screening, Risk Assessment and Web Applications, SETAC Press.
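
Of the topological descriptors listed on slide 6, the Wiener number is simple enough to sketch: it is the sum of the shortest-path distances (counted in bonds) over all pairs of atoms in the hydrogen-suppressed molecular graph. The code below is an illustrative sketch, not the lecture's own implementation; the adjacency-list representation and the function name are assumptions made for the example.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path distances over all unordered atom pairs.

    `adjacency` maps each atom label to a list of its bonded neighbours
    (hydrogen-suppressed, connected, undirected graph).
    """
    total = 0
    for source in adjacency:
        # Breadth-first search gives bond-count distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        if len(dist) != len(adjacency):
            raise ValueError("molecular graph must be connected")
        total += sum(dist.values())
    return total // 2  # every pair was counted from both ends


# Carbon skeletons of two C4H10 isomers.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}    # linear chain
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}   # branched

print(wiener_index(n_butane))   # 10
print(wiener_index(isobutane))  # 9
```

The branched isomer gets the smaller value, which is the kind of sensitivity to branching these indices were designed for.
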

  • 8. Structural similarity. Rosenkranz H.S., Cunningham A.R. (2001) Chemical Categories for Health Hazard Identification: A Feasibility Study, Regulatory Toxicology and Pharmacology 33, 313-318. Examined the reliability of using chemical categories to classify HPV chemicals as toxic or nontoxic. Found: most often only a proportion of chemicals in a category were toxic. Conclusion: "traditional organic chemical categories do not encompass groups of chemical that are predominately either toxic or nontoxic across a number of toxicological endpoints or even for specific toxic activities". Table note: the bold portion of the chemical in the Category column defined the fragment used to query each data set. Abbreviations: EyI, eye irritation; LD50, rat LD50; Dev, developmental toxicity; CA, rodent carcinogenesis; Mnt, in vivo induction of micronuclei; Sal, Salmonella mutagenesis; MLA, mutagenesis in cultured mouse lymphoma cells.

  • 9. 3D similarity. Distance-based and angle-based descriptors (e.g. inter-atomic distance). Field similarity (not an exhaustive list): Comparative Molecular Field Analysis (CoMFA), CoMSIA; electrostatic potential; shape; electron density; test probe; any grid-based structural property. Molecular multipole moments (CoMMA). Shape descriptors (not an exhaustive list): van der Waals volume and surface (reflect the size of substituents); Taft steric parameter; STERIMOL; Molecular Shape Analysis; 4D QSAR; WHIM descriptors. Receptor binding.

  • 10. Structurally similar compounds can have very different 3D properties. Kubinyi, H., Chemical Similarity and Biological Activity.
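
One common way to compare two grid-based fields of the kind listed on slide 9 is the Carbó index, C_AB = Z_AB / sqrt(Z_AA * Z_BB), where Z_AB is the overlap of the two fields. The sketch below approximates each overlap integral by a sum over grid points, assuming both molecules are already aligned and sampled on the same grid; the function name and the field values are made up for illustration.

```python
import math


def carbo_index(field_a, field_b):
    """Carbo index Z_AB / sqrt(Z_AA * Z_BB) for two fields on the same grid.

    Each overlap integral is approximated by a sum of products of the
    field values at corresponding grid points.
    """
    if len(field_a) != len(field_b):
        raise ValueError("fields must be sampled on the same grid")
    z_ab = sum(a * b for a, b in zip(field_a, field_b))
    z_aa = sum(a * a for a in field_a)
    z_bb = sum(b * b for b in field_b)
    if z_aa == 0 or z_bb == 0:
        raise ValueError("a field that is zero everywhere has no defined index")
    return z_ab / math.sqrt(z_aa * z_bb)


# Hypothetical field values (e.g. electrostatic potential) at four grid points.
field_a = [1.0, 2.0, 0.0, -1.0]
field_b = [2.0, 4.0, 0.0, -2.0]    # same shape as field_a, twice the magnitude
field_c = [-1.0, -2.0, 0.0, 1.0]   # field_a with the sign reversed

print(carbo_index(field_a, field_b))  # 1.0
print(carbo_index(field_a, field_c))  # -1.0
```

Because of the normalisation, the index responds to the shape of the field and not to its overall magnitude, as the first pair shows.
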

  • 11. Physicochemical properties. Molecular weight; octanol-water partition coefficient; total energy; heat of formation; ionization potential; molar refractivity; more.

  • 12. Quantum chemistry approaches. The wave function and the density function contain all the information of a system. All the information about any molecule could be extracted from the electron density. Bond creation and bond breaking in chemical reactions, as well as the shape changes in conformational processes, are expressed by changes in the electronic density of molecules. The electronic density fully determines the nuclear distribution, hence the electronic density and its changes account for all the relevant chemical information about the molecule. In principle, quantum-chemical theory should be able to provide precise quantitative descriptions of molecular structures and their chemical properties.

  • 13. Quantum chemistry approaches. Quantum chemical descriptors characterize the reactivity, shape and binding properties of a complete molecule or of molecular fragments and substituents: HOMO and LUMO energies, total energy, number of filled orbitals, standard deviation of partial atomic charges and electron densities, dipole moment, partial atomic charges. Approaches from the theory of Atoms in Molecules: BCP space, TAE/RECON, MEDLA, QShAR (additive density fragments). Quantum chemistry calculations depend on several levels of approximation. Computationally intensive.

  • 14. Reactivity. Similarity between reactions. Similarity of chemical structures is assessed by generalized reaction types and by gross structural features. Two structures are considered similar if they can be converted by reactions belonging to the same predefined groups (for example oxidation or substitution reactions).

  • 15. Similarity indices. Association, correlation, distance coefficients. Most popular: Tanimoto distance (fingerprints), $T_{AB} = \frac{N_{AB}}{N_A + N_B - N_{AB}}$; Euclidean distance (descriptors); Carbo index (fields), $C_{AB} = \frac{Z_{AB}}{\sqrt{Z_{AA} Z_{BB}}}$. Essentially a classification problem has to be solved (decide whether a query compound is closer to one or another set of compounds). Many methods are available (discriminant analysis, neural networks, SVM, Bayesian classification, etc.). Statistical assumptions and statistical error are involved.

  • 16. Similarity indices. Association indices; correlation indices. J. D. Holliday, C-Y. Hu and P. Willett (2002) Grouping of Coefficients for the Calculation of Inter-Molecular Similarity and Dissimilarity using 2D Fragment Bit-Strings, Combinatorial Chemistry & High Throughput Screening, 5, 155-166.

  • 17. Fingerprint similarity. Information loss: presence and absence of fragments instead of counts. Bit string saturation: within a large database almost all bits are set. Can give nonintuitive results. The average similarity appears to increase with the complexity of the query compound. Larger queries are more [...]. Figure: the distribution of Tanimoto values found in database searches with a range of query molec