9/19/2001, Page 1
2001 Herman Skolnik Award Symposium
Computer-Assisted Applications for the Practicing Chemist
Thirty years of computer-assisted applications for the synthetic chemist: Experiences of a non-programmer
222nd National Meeting of the American Chemical Society, August 26 – 30, Chicago, IL
Guenter Grethe
9/19/2001, Page 2
It all started here…
1961
…in the basement of the chemistry building at the Technical University Braunschweig
[Chemical structure figure; atom labels: Cl, OCH3, CH3, O, O, OH, CONH2]
9/19/2001, Page 3
Princeton Computer Chemistry Laboratory (1972)
DEC rules!
W. Todd Wipke and Robert Langridge
Entering the world of computer-assisted synthesis planning and 3D modeling of small and large molecules
…..and seeing the light….
9/19/2001, Page 4
NATO Advanced Study Institute, Noordwijkerhout, Netherlands - 1973
Organizers: W. Todd Wipke, Stephen R. Heller, Richard J. Feldmann, Ernest Hyde
The education continues….
9/19/2001, Page 5
The twilight zone
Chemistry, wet or dry?
…but it has to be synthetic chemistry!
ca. 1975
9/19/2001, Page 6
CASP at ROCHE
9/19/2001, Page 7
Convincing management – preaching the gospel
[Diagram: the CHEMIST at the center, connected to]
  CAS, ISI, etc. ONLINE
  SCIENTIFIC LIBRARY
  REACTION DATABASES
  SYNTHESIS PLANNING
  COMMERCIAL CHEMICALS
  CORPORATE DATABASES
9/19/2001, Page 8
The result: MACCS and REACCS at ROCHE and…
Applications
“You want to use a HONEYWELL? – No problem, the programs were developed on a PRIME”
Yeah, sure. REACCS never ran on the HONEYWELL!!!
The solution: ROCHE bought a VAX!
9/19/2001, Page 9
…our own little computer room
…developing MACCS-based applications for the scientists.
User friendly???
9/19/2001, Page 10
1985 The Big Change!
  move to California
  working for a young and exciting company
  working in an area that interested me most
  traveling, seminars and ‘preaching the gospel’ to my peers
  being closer to the fulfillment of my dreams
The challenge:
Eliminate the obstacles that prevent synthetic chemists from using the available tools AND from using them effectively.
Add:
  Re-define ‘user-friendly’
  Seamless integration of data
Example: Managing reaction information
ca. 1980
9/19/2001, Page 11
What are the major problems and potential solutions?
Major problems
  Query formulation process
  Evaluation of search results
  Limited involvement of synthetic chemist
Potential solutions
  Tools that simulate the problem solving process
  User interfaces based on users’ tasks and capabilities
  Simplification of the querying process
  Effective indexing of databases
  Efficient post-search management tools
  Seamless integration of various information sources
  Improved non-structural searches, e.g. hierarchical thesauri for keywords
Most importantly: Recognition of the vast knowledge of synthetic chemists
9/19/2001, Page 12
Problems associated with structural searches
Synthetic Problem:
[Reaction scheme figure; atom labels on each structure: N, O, CH3O, O, O]
Data Source: MDL’s combined reaction databases (ca. 950K reactions)
  Full Structure Search: No hits
  Reaction Substructure Search (colored fragment): 188 hits!
  Keyword Search “Michael Addition”: 3338 hits!!
Solution: Indexing of reactions
9/19/2001, Page 13
Reaction Classification Based on Reaction Centers (Reaction Type)
RCP program from InfoChem, Munich
The indexing is based on changes occurring at atoms and bonds involved in the reaction (reaction center) and the immediate vicinity (alpha and beta atoms) and is expressed as a hashcode.
It is an important tool to advance the involvement of chemists in the retrieval process.
Its uses include:
  Clustering reactions of the same type
  Post-management of large hitlists
  Facilitating query formulation (Transformation Searches)
  Linking of reaction information from different sources
MDL licenses InfoChem’s RCP program to classify all reaction databases.
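The idea of expressing the changes at the reaction center as a hashcode can be sketched in a few lines of Python. This is an illustrative toy, not InfoChem's RCP algorithm: the bond-dictionary representation, the function names `changed_bonds`, `reaction_center` and `center_hash`, and the use of SHA-1 are all assumptions made for the example, and the "canonical description" is simply a sorted list of bond changes rather than a full graph canonicalization.

```python
import hashlib

# Assumed representation: each side of a reaction is a dict that maps a
# frozenset of two atom-map numbers to a bond order. The atom-map numbers
# tie every reactant atom to the same atom in the product.

def changed_bonds(reactant_bonds, product_bonds):
    """Return the bonds that are made, broken, or change order."""
    all_bonds = set(reactant_bonds) | set(product_bonds)
    return {
        bond: (reactant_bonds.get(bond, 0), product_bonds.get(bond, 0))
        for bond in all_bonds
        if reactant_bonds.get(bond, 0) != product_bonds.get(bond, 0)
    }

def reaction_center(reactant_bonds, product_bonds):
    """Atoms that take part in at least one changed bond."""
    atoms = set()
    for bond in changed_bonds(reactant_bonds, product_bonds):
        atoms |= bond
    return atoms

def center_hash(elements, reactant_bonds, product_bonds):
    """Hash the bond changes by element type, independent of atom numbering."""
    changes = changed_bonds(reactant_bonds, product_bonds)
    described = sorted(
        (tuple(sorted(elements[a] for a in bond)), before, after)
        for bond, (before, after) in changes.items()
    )
    return hashlib.sha1(repr(described).encode()).hexdigest()[:12]

# Hypothetical example: a C-Cl bond is broken and a C-N bond is formed.
elements_a = {1: "C", 2: "Cl", 3: "N"}
reactants_a = {frozenset({1, 2}): 1}
products_a = {frozenset({1, 3}): 1}

# The same transformation drawn with different atom numbers.
elements_b = {7: "C", 8: "Cl", 9: "N"}
reactants_b = {frozenset({7, 8}): 1}
products_b = {frozenset({7, 9}): 1}

# A different transformation: C-Br broken instead of C-Cl.
elements_c = {1: "C", 2: "Br", 3: "N"}
```

Because the hash depends only on which kinds of bonds change, the two differently numbered drawings receive the same code, while the bromide example receives a different one. That shared code is what makes clustering and linking possible.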
9/19/2001, Page 14
Definitions of RCP Classification
0-Sphere (Broad)
  Reaction centers only, similar to broadly based substructure search
  large-sized cluster or hitlist
1-Sphere (Medium)
  Reaction centers plus alpha atoms, excluding hydrogens
  medium-sized cluster or hitlist
2-Sphere (Narrow)
  Reaction centers plus beta atoms, excluding consecutive sp3-atoms
  small-sized cluster or hitlist
[Structure figures illustrating the three levels; atom labels include N, C, CN, NH, H]
Number of hits from CIRX97 (70060 rxns) for identical transformation at different classification levels
[Plot: number of hits (axis values 700, 300, 50) against topological specificity (broad, medium, narrow); reactant and product labels O, O, OH, OH; classification codes shown: ...655778, ...151297, ...077692]
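The three levels differ only in how much of the environment around the reaction center is included. A minimal sketch, assuming a simple adjacency-list molecule and hypothetical helper names (`sphere_atoms`, `level_code`): level 0 keeps the center atoms, level 1 adds alpha atoms, level 2 adds beta atoms, and hydrogens are skipped during expansion. The slide's additional rule for the 2-sphere ("excluding consecutive sp3-atoms") is deliberately not implemented here, so this is a simplification of the real definitions.

```python
def sphere_atoms(center, neighbors, elements, level):
    """Center atoms plus all non-hydrogen atoms within `level` bonds of them."""
    selected = set(center)
    frontier = set(center)
    for _ in range(level):
        next_frontier = set()
        for atom in frontier:
            for nb in neighbors.get(atom, ()):
                if nb not in selected and elements[nb] != "H":
                    next_frontier.add(nb)
        selected |= next_frontier
        frontier = next_frontier
    return selected

def level_code(center, neighbors, elements, level):
    """Crude stand-in for a classification code: sorted element symbols."""
    atoms = sphere_atoms(center, neighbors, elements, level)
    return "".join(sorted(elements[a] for a in atoms))

# Hypothetical molecule: C1 and N2 form the reaction center, C3 is alpha
# to C1, O4 is beta (attached to C3), and H5 sits on N2.
elements = {1: "C", 2: "N", 3: "C", 4: "O", 5: "H"}
neighbors = {1: [2, 3], 2: [1, 5], 3: [1, 4], 4: [3], 5: [2]}
center = {1, 2}

broad = level_code(center, neighbors, elements, 0)    # center only
medium = level_code(center, neighbors, elements, 1)   # plus alpha atoms
narrow = level_code(center, neighbors, elements, 2)   # plus beta atoms
```

The more environment a code encodes, the fewer reactions share it, which is why the hit count falls from the broad to the narrow level.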
9/19/2001, Page 15
Reaction Classification as Post-Search Management Tool
Classification codes are data
  stored in the database
  usable for sorting (clustering)
RSS-Search Query (in red): [reaction scheme figure; atom labels N, O, CH3O, O, O]
Result: 188 hits
Clustered by Classification Code “MEDIUM”: 90 clusters
  1. Cluster (21 rxns) [reaction scheme figure]
  2. Cluster (15 rxns) [reaction scheme figure, chiral]
  4. Cluster (8 rxns) [reaction scheme figure]
Result: Large hitlist reduced to 21 relevant reactions in the first cluster, eliminating the noise
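Since the classification codes are stored as ordinary data, clustering a hitlist amounts to grouping by code and ranking the groups. The sketch below assumes a hitlist of (reaction id, code) pairs; the function name `cluster_hitlist`, the ids and the single-letter codes are placeholders, not real database values.

```python
from collections import defaultdict

def cluster_hitlist(hits):
    """Group (reaction_id, code) pairs by code, largest cluster first."""
    clusters = defaultdict(list)
    for reaction_id, code in hits:
        clusters[code].append(reaction_id)
    return sorted(clusters.items(), key=lambda item: len(item[1]), reverse=True)

# Hypothetical hitlist from a substructure search.
hits = [
    ("rxn1", "A"), ("rxn2", "B"), ("rxn3", "A"),
    ("rxn4", "C"), ("rxn5", "A"), ("rxn6", "B"),
]
ranked = cluster_hitlist(hits)

for code, members in ranked:
    print(code, len(members), members)
```

A chemist would then inspect one representative per cluster instead of paging through the whole hitlist.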
9/19/2001, Page 16
Reaction Classification as an Effective Querying Tool
Query Form of MDL’s Reaction Browser:
Result of ‘Same Transformation’ Search: 24 hits
Examples:
[Example reaction schemes; reagent labels: DBU, Pyrrolidine; two of the structures are marked chiral]
Eliminates problem of drawing efficient RSS-queries, easier to understand by end-user chemist
Calculates classification code of drawn reactions on-the-fly
Retrieves all reactions of the same reaction type without the noise of normal RSS searches
Result: Chemists do not have to struggle with formulating the most efficient RSS query to find relevant examples
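A ‘Same Transformation’ search can be pictured as classifying the drawn reaction on the fly and looking its code up in a prebuilt index. The following is a rough sketch under stated assumptions: `build_code_index`, `same_transformation_search` and `toy_classify` are hypothetical names, and the toy classifier (a string built from bond-change labels) only stands in for a real classification program.

```python
def build_code_index(coded_reactions):
    """Map each classification code to the ids of the reactions carrying it."""
    index = {}
    for reaction_id, code in coded_reactions:
        index.setdefault(code, []).append(reaction_id)
    return index

def same_transformation_search(drawn_reaction, index, classify):
    """Classify the drawn reaction, then fetch every reaction with that code."""
    return index.get(classify(drawn_reaction), [])

def toy_classify(reaction):
    """Stand-in classifier: a string built from the sorted bond changes."""
    made = ",".join(sorted(reaction["made"]))
    broken = ",".join(sorted(reaction["broken"]))
    return "made:" + made + ";broken:" + broken

# Hypothetical stored reactions, described only by their bond changes.
stored = [
    ("rxn1", {"made": ["C-C"], "broken": ["C=C"]}),
    ("rxn2", {"made": ["C-N"], "broken": ["C-Cl"]}),
    ("rxn3", {"made": ["C-C"], "broken": ["C=C"]}),
]
index = build_code_index((rid, toy_classify(rxn)) for rid, rxn in stored)

# The chemist draws a full reaction; no substructure query is needed.
query = {"made": ["C-C"], "broken": ["C=C"]}
same_type = same_transformation_search(query, index, toy_classify)
```

The lookup returns only reactions with the identical code, which is why this kind of search avoids the noise of a broad substructure query.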
9/19/2001, Page 17
Strategy Meeting in Munich
Where else can reaction classification be used to benefit the synthetic chemist?
Facilitating access to and linking of information sources, of course!
9/19/2001, Page 18
Access to Information Sources
Requirements
Access must simulate chemists’ information gathering process
Access must be multi-directional, fast and seamless to
  primary sources (journals, laboratory notebooks, etc.)
  secondary sources (databases)
  tertiary sources (major reference works, review articles, etc.)
  other data (catalogues, spectral data, etc.)
All sources must be interlinked and accessible from any source
Intra- or internet as the primary medium
Information Triangle: primary sources, secondary sources, tertiary sources
9/19/2001, Page 19
The integrated Major Reference Works (iMRW) Project
[Diagram: three information sources and the links between them]
Sources
  Reaction Databases (ISIS/Host) (MDL, Third Party, Proprietary, etc.)
  Tertiary Sources (COFGT, EROS, CAC, etc.)
  Primary Journals
Links
  Reaction Classification Codes
  LitLink (citations)
  Rxn Class. Codes, citations, structures
  Future links
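Linking sources through shared classification codes is, in data terms, a join on the code. The sketch below is only an illustration of that idea: the source names, record ids and codes are invented, and `link_sources` and `related_records` are hypothetical helpers, not part of any MDL or InfoChem product.

```python
def link_sources(sources):
    """Map each classification code to the records carrying it, per source."""
    links = {}
    for source_name, records in sources.items():
        for record_id, code in records:
            links.setdefault(code, {}).setdefault(source_name, []).append(record_id)
    return links

def related_records(links, code, exclude_source=None):
    """Records in all other sources that share the given code."""
    return {
        source: ids
        for source, ids in links.get(code, {}).items()
        if source != exclude_source
    }

# Hypothetical records: (record id, classification code).
sources = {
    "reaction_db": [("RX-1", "c1"), ("RX-2", "c2")],
    "reference_work": [("chapter 3.2", "c1")],
    "journal": [("article 17", "c1"), ("article 40", "c2")],
}
links = link_sources(sources)

# Starting from database reaction RX-1 (code c1), find related material.
related = related_records(links, "c1", exclude_source="reaction_db")
```

Because every source is keyed by the same code, the lookup works in any direction, for example from a reference-work chapter back to database reactions.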
9/19/2001, Page 20
iMRW – Linking of reaction information
The beginnings of a personal electronic library?
Information from “Comprehensive Asymmetric Catalysis” (Springer Verlag): stereochemistry, mechanism
Information from “Encyclopedia of Reagents for Organic Synthesis” (Wiley & Sons)
9/19/2001, Page 21
Computers will never be a substitute for the creativity and experience of synthetic chemists; they will just make them more efficient…
…provided they are being given the right tools
Much has been achieved over the last three decades, but we still have a long way to go!