Wendy Warr & Associates Tautomerism in chemical information management systems Dr. Wendy A. Warr http://www.warr.com

Tautomerism in chemical information management systems · 2017-06-27 · Tautomerism in chemical information management systems Author: Wendy A. Warr DOI: 10.1007/s10822-010-9338-4


Wendy Warr & Associates

Tautomerism in chemical information management systems

Dr. Wendy A. Warr

http://www.warr.com


Tautomerism in chemical information management systems

Author: Wendy A. Warr

DOI: 10.1007/s10822-010-9338-4


Perspectives Issue Devoted to Tautomerism in Molecular Design

Edited by Yvonne Martin


“Chemical Information” Aspects

• Registration procedures

• Storage of tautomers

• Exact and substructure search

• Depiction of results


Software and Database Vendors

• Accelrys

• ACD/Labs

• Beilstein/Reaxys

• CambridgeSoft

• CAS

• CCDC

• CCG

• ChemAxon

• ChemoSoft

• ChemSpider

• CWM Global Search

• Daylight

• Dialog

• IDBS

• InfoChem

• InhibOx

• John Wiley & Sons

• Molecular Networks

• NCI/CADD

• OpenEye

• Thieme

• PubChem

• Questel

• Schrödinger

• SciTouch

• Symyx

• Thomson Reuters

• Xemistry (CACTVS)


Not Included

• ABCD (J&J)

• BioRad (KnowItAll)

• CDK

• eMolecules

• SimBioSys

• Tripos

• ZINC


Chemical Structure Representation


Morgan Algorithm


Morgan, H. L. The generation of a unique machine description for chemical structures - a technique developed at Chemical Abstracts Service. J. Chem. Doc. 1965, 5(2), 107-113.


CTfile


SMILES


CC1=CC(Br)CCC1


SMILES

• OpenEye canonical SMILES

• Daylight canonical SMILES

• SciTouch canonical SMILES

• ChemAxon canonical SMILES

• …


IUPAC International Chemical Identifier (InChI)


InChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3

InChIKey=RYYVLZVUVIJVGH-UHFFFAOYSA-N
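
An InChI string is a sequence of slash-separated layers, which is what lets a system compare structures layer by layer. The following is a minimal sketch of splitting the string on this slide into its layers. The function name `split_inchi_layers` and the dictionary keys are illustrative choices, not part of any InChI toolkit, and the sketch assumes each layer prefix occurs only once.

```python
from typing import Dict


def split_inchi_layers(inchi: str) -> Dict[str, str]:
    """Split an InChI string into version, formula and one-letter-prefixed layers."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    parts = inchi[len("InChI="):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # Remaining layers carry a one-letter prefix such as c, h, q, p, t, m or s.
            # Assumption: a repeated prefix simply overwrites the earlier entry.
            layers[part[0]] = part[1:]
    return layers


slide_inchi = "InChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3"
layers = split_inchi_layers(slide_inchi)
print(layers["formula"])  # molecular formula layer
print(layers["c"])        # connectivity layer
print(layers["h"])        # hydrogen layer
```

The hydrogen layer is the one that matters for tautomerism, because mobile hydrogens are recorded there rather than being fixed to one atom.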


NCI/CADD Identifiers (CACTVS Hashcodes)


9850FD9F9E2B4E25-FICTS-01-57

9850FD9F9E2B4E25-FICuS-01-78

9850FD9F9E2B4E25-uuuuu-01-27
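
Each identifier on this slide pairs a 16-digit hashcode with a five-letter label (FICTS, FICuS, uuuuu). In the label, the letters stand for Fragments, Isotopes, Charges, Tautomers and Stereo, and a lowercase `u` marks a feature the identifier ignores. The sketch below reads that label. The function name and the returned keys are illustrative, and the two trailing two-digit fields are returned uninterpreted because the slide does not say what they mean.

```python
import re
from typing import Dict

# Order of the features encoded by the five-letter label (F, I, C, T, S).
FEATURES = ["fragments", "isotopes", "charges", "tautomers", "stereo"]
PATTERN = re.compile(r"^([0-9A-F]{16})-([A-Za-z]{5})-(\d{2})-(\d{2})$")


def parse_identifier(identifier: str) -> Dict[str, object]:
    """Split an identifier into hashcode, label and the features it is sensitive to."""
    match = PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unexpected identifier format: {identifier!r}")
    hashcode, label, field3, field4 = match.groups()
    # A lowercase 'u' in a position means that feature is not distinguished.
    sensitive = [name for name, letter in zip(FEATURES, label) if letter != "u"]
    return {
        "hashcode": hashcode,
        "label": label,
        "sensitive_to": sensitive,
        "trailing_fields": (field3, field4),  # left uninterpreted in this sketch
    }


for text in ("9850FD9F9E2B4E25-FICTS-01-57",
             "9850FD9F9E2B4E25-FICuS-01-78",
             "9850FD9F9E2B4E25-uuuuu-01-27"):
    parsed = parse_identifier(text)
    print(parsed["label"], parsed["sensitive_to"])
```

Under this reading, a FICuS identifier is the natural key when tautomers should be treated as one compound.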


Definition of Tautomerism

Q = C, N, S, P, Sb, As, Se, Te, Br, Cl or I

M, Z = trivalent N, bivalent O, S, Se or Te

[Either M or Z = C]

H = H, D, T [or + or -]

Extended system, ring/chain, etc.


M=Q-ZH ⇌ HM-Q=Z
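
The interconversion between M=Q-ZH and HM-Q=Z can be sketched as a graph transformation. This is a minimal illustration, not any vendor's implementation. It assumes a toy molecule representation (a tuple of `(element, hydrogen count)` atoms plus a set of `(i, j, order)` bonds), reads the bracketed note as "one of M or Z may be carbon", and does no valence, charge or isotope checking. All names are hypothetical.

```python
from typing import FrozenSet, Set, Tuple

Atom = Tuple[str, int]        # (element symbol, attached hydrogen count)
Bond = Tuple[int, int, int]   # (atom index i, atom index j, bond order)
Mol = Tuple[Tuple[Atom, ...], FrozenSet[Bond]]

Q_ELEMENTS = {"C", "N", "S", "P", "Sb", "As", "Se", "Te", "Br", "Cl", "I"}
HETERO = {"N", "O", "S", "Se", "Te"}


def one_three_shifts(mol: Mol) -> Set[Mol]:
    """Return every structure reachable by one M=Q-ZH -> HM-Q=Z shift."""
    atoms, bonds = mol
    order = {}
    for i, j, o in bonds:
        order[(i, j)] = o
        order[(j, i)] = o
    products = set()
    for (m, q), o_mq in order.items():
        if o_mq != 2 or atoms[q][0] not in Q_ELEMENTS:
            continue
        for (q2, z), o_qz in order.items():
            if q2 != q or z == m or o_qz != 1 or atoms[z][1] == 0:
                continue
            ends = (atoms[m][0], atoms[z][0])
            # At least one end must be a heteroatom; the other may be carbon.
            if not any(e in HETERO for e in ends):
                continue
            if not all(e in HETERO or e == "C" for e in ends):
                continue
            new_atoms = list(atoms)
            new_atoms[z] = (atoms[z][0], atoms[z][1] - 1)  # Z loses the hydrogen
            new_atoms[m] = (atoms[m][0], atoms[m][1] + 1)  # M gains it
            changed = {frozenset((m, q)): 1, frozenset((q, z)): 2}
            new_bonds = frozenset(
                (i, j, changed.get(frozenset((i, j)), o)) for i, j, o in bonds
            )
            products.add((tuple(new_atoms), new_bonds))
    return products


# Acetone (keto) and prop-1-en-2-ol (enol), heavy atoms numbered C0, C1, O2, C3.
keto: Mol = ((("C", 3), ("C", 0), ("O", 0), ("C", 3)),
             frozenset({(0, 1, 1), (1, 2, 2), (1, 3, 1)}))
enol: Mol = ((("C", 2), ("C", 0), ("O", 1), ("C", 3)),
             frozenset({(0, 1, 2), (1, 2, 1), (1, 3, 1)}))
print(enol in one_three_shifts(keto), keto in one_three_shifts(enol))
```

Longer-range shifts, ring/chain tautomerism and charged variants would need further patterns and are outside this sketch.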


Straightforward


[Figure: examples of 1,3, 1,5 and 1,7 shifts]


More Complex


[Figure: structures 1, 2, 3 and 4]


Degree of Unsaturation


Ring Opening


Fluxional structures


Mesomers


NEMA Key=6P1SUP7NENNHV4V61WRZP5S2ES8NZF

NEMA Key=CKGEHDBX4KZPW3VV6DXTVM5BB689GB

InChI=1S/C16H18N3S.ClH/c1-18(2)11-5-7-13-15(9-11)20-16-10-12(19(3)4)6-8-14(16)17-13;/h5-10H,1-4H3;1H/q+1;/p-1

InChI=1S/C16H18N3S.ClH/c1-18(2)11-5-7-13-15(9-11)20-16-10-12(19(3)4)6-8-14(16)17-13;/h5-10H,1-4H3;1H/q+1;/p-1

InChIKey=CXKWCBBOMKCUKX-UHFFFAOYSA-M

Same InChIKey

Different NEMA Keys


Tautomers


NEMA Key=CU3YSHT7DX8KUTKGRNS5GH3B4UQBFA

NEMA Key=CTDBHWQW8CQJHC3S5AH6X4QJWAVMKD

InChI=1S/C10H13N5O3/c11-8-7-9(14-10(17)13-8)15(4-12-7)6-2-1-5(3-16)18-6/h4-6,16H,1-3H2,(H3,11,13,14,17)/t5-,6+/m0/s1

InChI=1S/C10H13N5O3/c11-8-7-9(14-10(17)13-8)15(4-12-7)6-2-1-5(3-16)18-6/h4-6,16H,1-3H2,(H3,11,13,14,17)/t5-,6+/m0/s1

InChIKey=KITPKMKMNZXFDK-NTSWFWBYSA-N

Different NEMA Keys

Same InChIKey
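
The two tautomers share an InChI because the hydrogen layer records the migrating hydrogens as a mobile group, `(H3,11,13,14,17)`, instead of fixing them on particular atoms. The sketch below pulls such groups out of the `/h` layer. It is a rough illustration under stated assumptions: it handles only the plain `(Hn,a,b,...)` form seen on this slide, and the function name is hypothetical.

```python
import re
from typing import List, Tuple

# Matches groups like (H,1,2) or (H3,11,13,14,17) inside the /h layer.
MOBILE_H = re.compile(r"\(H(\d*),([\d,]+)\)")


def mobile_h_groups(inchi: str) -> List[Tuple[int, List[int]]]:
    """Return (hydrogen count, atom numbers) for each mobile-H group in the /h layer."""
    h_layer = next((p[1:] for p in inchi.split("/")[2:] if p.startswith("h")), "")
    groups = []
    for count, atoms in MOBILE_H.findall(h_layer):
        groups.append((int(count) if count else 1, [int(a) for a in atoms.split(",")]))
    return groups


tautomer_inchi = ("InChI=1S/C10H13N5O3/c11-8-7-9(14-10(17)13-8)15(4-12-7)6-2-1-5(3-16)18-6"
                  "/h4-6,16H,1-3H2,(H3,11,13,14,17)/t5-,6+/m0/s1")
fixed_inchi = "InChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3"

print(mobile_h_groups(tautomer_inchi))  # three hydrogens shared by atoms 11, 13, 14, 17
print(mobile_h_groups(fixed_inchi))     # no mobile group
```

A registration system could use a non-empty result as a cheap flag that a structure has tautomeric forms worth checking.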


Unreasonable


Multiple

Overlapping


[Figure: structures 5, 6, 7 and 8]


Overlapping


[Figure: structures 9, 10 and 11]


Registration


Registration Objectives

• Corporate database

• Stock room database

• Predicting spectra

• Reaction mechanisms

• Ultra-low temperature lab


Registration Options

• Enumerate all tautomers; store all tautomers

• Calculate canonical tautomer; store canonical tautomer

• Enumerate all tautomers

– Rank [as major, minor, or conditions dependent (ACD/Labs)]

– Allow user to choose which form to store


Schrödinger

• Epik

– Enumerate all energetically reasonable tautomers

– Enumerate all energetically reasonable ionization states

• Store all tautomers and ionization states

• Canvas

– identifies duplicates by canonical SMILES


Are A and B Tautomers?

• If A and B are identical, accept

• If the total number of hydrogen atoms or charges is not identical, reject

• Examine the heavy-atom skeletons; reject if not identical

• Enumerate all tautomers for A; if any is the same as B, accept

• Enumerate all tautomers for B; if any is the same as A, accept

• Otherwise reject.
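
The checks above can be sketched as a short decision function. This is an illustration under stated assumptions: molecules use a toy representation with consistent atom numbering, "charges" is read as net formal charge, and the tautomer enumerator is passed in as a hypothetical callable rather than taken from any particular toolkit.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Atom = Tuple[str, int, int]   # (element, attached hydrogens, formal charge)
Bond = Tuple[int, int, int]   # (atom index i, atom index j, bond order)
Mol = Tuple[Tuple[Atom, ...], FrozenSet[Bond]]


def total_hydrogens(mol: Mol) -> int:
    return sum(h for _, h, _ in mol[0])


def total_charge(mol: Mol) -> int:
    return sum(q for _, _, q in mol[0])


def heavy_atom_skeleton(mol: Mol):
    """Elements and connections only, ignoring hydrogens, charges and bond orders."""
    elements = tuple(e for e, _, _ in mol[0])
    connections = frozenset(frozenset((i, j)) for i, j, _ in mol[1])
    return elements, connections


def are_tautomers(a: Mol, b: Mol,
                  enumerate_tautomers: Callable[[Mol], Iterable[Mol]]) -> bool:
    if a == b:
        return True                      # identical: accept
    if total_hydrogens(a) != total_hydrogens(b) or total_charge(a) != total_charge(b):
        return False                     # hydrogen or charge totals differ: reject
    if heavy_atom_skeleton(a) != heavy_atom_skeleton(b):
        return False                     # different skeletons: reject
    if b in set(enumerate_tautomers(a)):
        return True                      # B is among the tautomers of A
    if a in set(enumerate_tautomers(b)):
        return True                      # A is among the tautomers of B
    return False                         # otherwise reject


keto: Mol = ((("C", 3, 0), ("C", 0, 0), ("O", 0, 0), ("C", 3, 0)),
             frozenset({(0, 1, 1), (1, 2, 2), (1, 3, 1)}))
enol: Mol = ((("C", 2, 0), ("C", 0, 0), ("O", 1, 0), ("C", 3, 0)),
             frozenset({(0, 1, 2), (1, 2, 1), (1, 3, 1)}))
alcohol: Mol = ((("C", 3, 0), ("C", 1, 0), ("O", 1, 0), ("C", 3, 0)),
                frozenset({(0, 1, 1), (1, 2, 1), (1, 3, 1)}))

# Toy enumerator that only knows the enol -> keto direction.
one_way = {enol: [keto]}
toy_enumerator = lambda mol: one_way.get(mol, [])

print(are_tautomers(keto, enol, toy_enumerator))     # accepted via enumeration from B
print(are_tautomers(keto, alcohol, toy_enumerator))  # rejected on hydrogen count
```

The cheap tests come first so that the expensive enumeration runs only for pairs that could plausibly be tautomers, and enumerating from both sides guards against an enumerator that is not symmetric.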


Enumeration of tautomers

• Sayle, R. A.; Delany, J. J. Canonicalization and enumeration of tautomers. Paper presented at EuroMUG99, Cambridge, UK, 28-29 Oct 1999.

• Oellien, F.; Cramer, J.; Beyer, C.; Ihlenfeldt, W-D.; Selzer, P. M. The impact of tautomer forms on pharmacophore-based virtual screening. J. Chem. Inf. Model. 2006, 46, 2342-2354.

• Greenwood, J. R.; Calkins, D.; Sullivan, A. P.; Shelley, J. C. Towards the comprehensive, rapid, and accurate prediction of the favorable tautomeric states of drug-like molecules in aqueous solution. J. Comput.-Aided Mol. Des. 2010, published online March 31, 2010.


Storage of Tautomers


Concept A

• Generate all tautomers

• Impossible to calculate lowest energy tautomer

• Use rules for consistent generation [of a low energy form]

• Store this form [as canonical SMILES]
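
Concept A amounts to picking one form from the enumerated set in a way that never depends on input order. A minimal sketch follows, assuming each tautomer is already written as a canonical SMILES string. The scoring rule shown (prefer forms with more `=O`) is a made-up stand-in for real empirical rules, and the function names are hypothetical.

```python
from typing import Callable, Iterable


def choose_canonical(tautomers: Iterable[str], score: Callable[[str], float]) -> str:
    """Pick the highest-scoring tautomer; ties go to the lexicographically smallest."""
    candidates = sorted(set(tautomers))
    if not candidates:
        raise ValueError("no tautomers supplied")
    # max() returns the first maximal item, so sorting first makes ties deterministic.
    return max(candidates, key=score)


def prefer_carbonyl(smiles: str) -> int:
    # Toy rule for illustration only: count carbonyl-like "=O" occurrences.
    return smiles.count("=O")


forms = ["CC(O)=C", "CC(C)=O"]
print(choose_canonical(forms, prefer_carbonyl))
print(choose_canonical(reversed(forms), prefer_carbonyl))
```

Whatever rules are used, the property that matters for registration is that every tautomer of a compound leads to the same stored form.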


Concept B

• Generate all tautomers

• Impossible to calculate lowest energy tautomer

• Store all tautomers

• [Store all protomers]
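
Concept B can be pictured as an index in which every tautomer of a compound points to the same registry number, so an exact-match lookup succeeds whichever form the user draws. The sketch below assumes structures are canonical SMILES strings and that the enumerator is supplied by the caller. The class and method names are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional


class TautomerRegistry:
    """Toy registry that stores every enumerated tautomer under one registry id."""

    def __init__(self, enumerate_tautomers: Callable[[str], Iterable[str]]):
        self._enumerate = enumerate_tautomers
        self._index: Dict[str, int] = {}
        self._next_id = 1

    def register(self, structure: str) -> int:
        forms = sorted({structure, *self._enumerate(structure)})
        # Reuse an existing id if any form of this compound is already stored.
        reg_id = next((self._index[f] for f in forms if f in self._index), None)
        if reg_id is None:
            reg_id = self._next_id
            self._next_id += 1
        for form in forms:
            self._index.setdefault(form, reg_id)
        return reg_id

    def find(self, structure: str) -> Optional[int]:
        return self._index.get(structure)


# Toy enumerator covering one keto/enol pair.
pairs = {"CC(C)=O": ["CC(O)=C"], "CC(O)=C": ["CC(C)=O"]}
registry = TautomerRegistry(lambda s: pairs.get(s, []))

first = registry.register("CC(C)=O")
second = registry.register("CC(O)=C")   # recognised as the same compound
print(first, second, registry.find("CC(O)=C"), registry.find("CCO"))
```

The trade-off against Concept A is storage: the index grows with the number of tautomers per compound, but no canonicalization rules are needed at search time.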


Structure Search


Structure Search

• Exact matches done by “flexmatch”, SMILES, hashcodes etc.

• Substructure search

– Hard to perceive all tautomers for a substructure


Approaches to Substructure Search

• Address problem at registration stage

– store all tautomers

• Address problem at search stage

– enumerate database structures on the fly

– or enumerate query structure

– or user takes care specifying query

• Combine methods

• Ignore the problem


Depiction of Results


Depicting Results

• Display input, registered structure

• Display matched tautomer

– good approach if substructure is highlighted

• Display standard form

• Let user choose

• Experimental results match displayed tautomer


ChemAxon Approaches

• Normalize the structure (“generic tautomers”)

• Allow for tautomers at search time

• Choose a preferred tautomer

• Customize preferences in Standardizer


ChemAxon Tautomerization Plugin

• Generates all, dominant and canonical tautomers

• Calculates canonical tautomer by empirical rules

• Tries to make canonical tautomer the dominant tautomer (includes pKa filter)

• Handles dearomatization and stereochemistry


Customization

• Choose dominant tautomer

– set operating pH

– set maximum distance (# bonds) of a single proton migration

– protect structural features

• aromaticity, charge, stereochemistry, stable functional groups

– exclude unstable antiaromatic compounds


ChemAxon software

• Stores canonical form, or all tautomers

• Enumerates query tautomers (as far as possible)

• Usually displays structure originally input

• Optionally displays standard tautomer


Observations

• Computational chemistry companies

– does the ligand match the receptor?

– ligand preparation

– pKa algorithms, rules, energetics

– “rigorous” approaches

• Informatics companies

– does the compound match the patent?

– building registries and inventories

– graph theory

– examples (structures)

– pragmatic approaches

• Hybrid companies


Acknowledgments

• All 28 “vendors”

– including ChemAxon

• Jonathan Brecher

• Geoff Skillman

• Russ Hillard

• Keith Taylor
