Reaction simulation expert systems for synthetic organic chemistry
Jonathan H. Chen and Pierre Baldi
University of California, Irvine
School of Information and Computer Sciences
Institute for Genomics and Bioinformatics
School of Medicine
http://cdb.ics.uci.edu
Reaction Prediction
Given a mixture of reactants and reaction conditions, predict the major products
[Figure: reaction scheme with reagent NaOMe and heat (Δ); the product is shown as "?"]
Fundamental problem-solving skill of expert human chemists
Critical for applications such as retro-synthesis design and reaction discovery
Need for Reproducible Expertise
Automated suggestion of synthetic reactions by pattern matching is straightforward, but “expertise” is knowing which suggestions are actually feasible and reasonable.
[Figure: example targets bupropion, atorvastatin, albuterol, and fenofibrate, with reagents DCC, KMnO4, Pd(0), CO (gas), and Mg]
Transformation Rules
[Figure: two-step mechanism: π-bond protic acid addition, then carbocation halide addition]
Chemical state machine modeling at mechanistic level of detail
• State information: molecular structure
• State transition: transformation rules
SMIRKS | Description
[C:1]=[C:2].[H:3][Cl,Br,I:4]>>[+0:3][C:1][C+:2].[Cl,Br,I;-:4] | Alkene, Protic Acid Addition
[C+:1].[Cl,Br,I;-:2]>>[C+0:1][+0:2] | Carbocation, Halide Addition
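The state-machine idea can be sketched in a few lines of Python. This is a toy illustration only: species are opaque SMILES labels and each rule is a fixed reactant-to-product lookup, whereas a real system would match the SMIRKS patterns above against molecular graphs with a cheminformatics toolkit. The names RULES and run are hypothetical, and propene with HBr is an assumed example.

```python
# Toy chemical state machine: the state is the set of species present,
# and each transformation rule is a state transition.
# Species are plain SMILES labels; no structure matching is performed.
RULES = [
    # (rule name, reactants consumed, products formed)
    ("Alkene, Protic Acid Addition",
     frozenset({"CC=C", "Br"}), frozenset({"C[CH+]C", "[Br-]"})),
    ("Carbocation, Halide Addition",
     frozenset({"C[CH+]C", "[Br-]"}), frozenset({"CC(C)Br"})),
]


def run(state, rules, max_steps=10):
    """Apply the first applicable rule repeatedly until none applies."""
    state = frozenset(state)
    trace = []
    for _ in range(max_steps):
        for name, reactants, products in rules:
            if reactants <= state:  # all reactants are present
                state = (state - reactants) | products
                trace.append((name, state))
                break
        else:
            break  # no rule matched: terminal state
    return state, trace


# Assumed starting mixture: propene ("CC=C") and HBr ("Br").
final_state, trace = run({"CC=C", "Br"}, RULES)
```

The trace records each mechanistic step, which is what allows a prediction to be explained rather than only stated.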
Reaction Explorer [DEMO]
Product prediction for different reactions using a common reagent:
• Sn2: CC1(Oc2cc(c3cc[nH]c3c2O1)CBr)C
• Nucleophilic Acylation: c1c2c(c(cn1)Cl)CCOC2=O
• Robinson Annulation: C[C@H]1c2c(cccn2)CCC1=O
Mechanistic detail explains how and why products are created
Use as synthesis workspace (Tylenol, #94)
Results and Progress
Expert system with over
• 80 reagent models
• 1,500 reaction rules
• 4,500 validation examples
Subject Categories Implemented
• Substitution and Elimination of Alkyl Halides
• Alcohols and Epoxides
• Alkenes, Electrophilic Addition
• Alkynes, Addition and Acetylide Ions
• Alkanes, Radical Reactions
• Dienes, Conjugation, Diels-Alder
• Electrophilic Aromatic Substitution
• Reactions of Substituted Benzenes
• Oxidation-Reduction Reactions
• Aldehydes and Ketones
• Carboxylic Acid Derivatives
• Enolate Chemistry
• Aldol Chemistry
• Amines and Arenediazonium Reactions
• Transition Metal (Palladium) Catalysis
• SnAr and Benzyne Reactions
• Naphthalene and Heteroaromatic Reactions
• Pericyclic Reactions
• Carbohydrates
• Amino Acid and Peptide Synthesis
J. Chem. Educ. 2008, 85, 1699
Principle-Driven Simulations
• Not based on transformation rules
• Driven by principles of physical chemistry
Key Components
• Core Reaction Unit Model
• Scoring Function for Reactions
• Chemical Kinetics Simulation
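[Figure: orbital interactions nN with π*C-O and nO with σ*C-Cl; orbital energy ladder, from highest to lowest: σ*, π*, p, n, π, σ; energy diagram of relative energy vs. reaction coordinate, showing ΔG and ΔG‡]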
Core Reaction Unit Model
Bond-rearrangement patterns are the most typical choice. They represent only the overall “symptom” of the reaction, not the underlying mechanistic steps, so many such patterns must be “memorized” to get decent coverage.
• Sn2 Substitution: [CX4H2:1][Br:2]>>[C:1]O
• Acyl Substitution (Saponification): [O:2]=[C:1][OH0:3]>>[O:2]=[C:1][O-].[O-:3]
• Robinson Annulation: [*:3][C:2]1[C:11][C:10][C:9][C:8][C:1]1=[O:20].[C:5][C:4](=[O:12])[C:6]=[C:7]>>[*:3][C:2]12[C:11][C:10][C:9][C:8][C:1]1=[C:5][C:4](=[O:12])[C:6][C:7]2
Core Reaction Unit Model
Molecular Orbital Interactions as Elementary Reaction Steps
More Favorable | Less Favorable
nCl > pC | nI > pC
πC=C > σ*H-Br | πC=C > σ*H-O
σH-B > π*C=O (ketone) | σH-B > π*C=O (amide)
[Figure: orbital energy ladder, from highest to lowest: σ*, π*, p, n, π, σ]
Scoring Function for Reactions
Purpose
• Identify favorable reaction steps
• Ideally predicts transition state activation energies (ΔG‡)
Statistical Machine Learning
• Limited quantitative data available
• Inspiration from the problem-solving abilities of human experts
• Use qualitative knowledge of reactivity trends as a major training data source
[Figure: energy diagram of relative energy vs. reaction coordinate, showing ΔG and ΔG‡]
C. A. Azencott, M. A. Kayala, P. Baldi, “Learning Scoring Functions for Chemical Expert Systems”
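One way to turn qualitative reactivity trends into a scoring function is pairwise ranking: each "more favorable vs. less favorable" statement becomes a training constraint. The sketch below is an assumption-laden illustration, not the method of the cited work. It uses a margin perceptron with one learned weight per source and sink orbital label, trained on the three source > sink comparisons listed earlier in these slides. The names PREFERENCES, score, and train are hypothetical.

```python
from collections import defaultdict

# (more favorable, less favorable) interactions as (source, sink) pairs,
# copied from the slide's "More Favorable / Less Favorable" comparisons.
PREFERENCES = [
    (("nCl", "pC"), ("nI", "pC")),
    (("πC=C", "σ*H-Br"), ("πC=C", "σ*H-O")),
    (("σH-B", "π*C=O (ketone)"), ("σH-B", "π*C=O (amide)")),
]


def score(weights, interaction):
    """Additive score: one weight for the source, one for the sink."""
    source, sink = interaction
    return weights[("source", source)] + weights[("sink", sink)]


def train(preferences, margin=1.0, lr=0.5, epochs=100):
    """Margin perceptron on pairwise preferences (assumed learner)."""
    weights = defaultdict(float)
    for _ in range(epochs):
        updated = False
        for better, worse in preferences:
            if score(weights, better) - score(weights, worse) < margin:
                # Push the preferred interaction up and the other down.
                for role, orbital in zip(("source", "sink"), better):
                    weights[(role, orbital)] += lr
                for role, orbital in zip(("source", "sink"), worse):
                    weights[(role, orbital)] -= lr
                updated = True
        if not updated:
            break  # every preference is satisfied with the margin
    return weights


weights = train(PREFERENCES)
```

Weights shared by both sides of a comparison cancel, so only the orbital that differs determines the ordering. A practical system would replace the per-label weights with descriptive features so that scores generalize to unseen interactions.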
Chemical Kinetics Simulation
Law of Mass Action Simulation
• Results depend on reactivity scores and concentrations
• Reversible reactions driven by Le Chatelier’s principle
• Catalytic quantities of highly reactive species
• Discrete simulation approximation to bootstrap off incomplete information
Eyring-Evans-Polanyi Equation
• Principled conversion of ΔG‡ to reaction rate constant k with temperature dependence
• Theory only applies for elementary reaction steps
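The two ingredients above can be sketched together: the Eyring equation, k = (k_B·T/h)·exp(−ΔG‡/RT), converts an activation free energy into a rate constant, and the law of mass action turns rate constants and concentrations into concentration changes. The code below is a minimal sketch under stated assumptions: a transmission coefficient of 1, plain explicit Euler integration (not the slides' discrete approximation), and made-up barriers of 80 and 85 kJ/mol for a hypothetical reversible step A ⇌ B. The function names are illustrative.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s
R = 8.314462618      # gas constant, J/(mol*K)


def eyring_rate(dg_act_kj_mol, temperature_k=298.15):
    """Rate constant from ΔG‡ (kJ/mol); transmission coefficient assumed 1."""
    exponent = -dg_act_kj_mol * 1000.0 / (R * temperature_k)
    return (K_B * temperature_k / H) * math.exp(exponent)


def simulate(conc, reactions, dt, steps):
    """Explicit Euler integration of mass-action kinetics.

    conc: dict of species -> concentration (mol/L)
    reactions: list of (reactants, products, k) elementary steps
    """
    conc = dict(conc)
    for _ in range(steps):
        delta = {species: 0.0 for species in conc}
        for reactants, products, k in reactions:
            rate = k
            for species in reactants:
                rate *= conc[species]  # law of mass action
            for species in reactants:
                delta[species] -= rate * dt
            for species in products:
                delta[species] += rate * dt
        for species in conc:
            conc[species] = max(conc[species] + delta[species], 0.0)
    return conc


# Hypothetical reversible elementary step A <=> B (barriers are assumptions).
k_forward = eyring_rate(80.0)
k_reverse = eyring_rate(85.0)
final = simulate(
    {"A": 1.0, "B": 0.0},
    [(("A",), ("B",), k_forward), (("B",), ("A",), k_reverse)],
    dt=0.1,
    steps=2000,
)
```

At long times the ratio of B to A approaches k_forward / k_reverse, which is how the reversible pair settles to its equilibrium position. The time step must stay small relative to 1/k for the Euler scheme to remain stable.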
Reaction Simulator [DEMO]
Simple enolate deprotonation
• No other input but starting materials; self-perception of reactive sites and combinations
• Kinetic vs. thermodynamic simulator controls
Complex example evolving over time
• Trace full reaction pathway to justify the prediction, including energy diagram
Summary Comparison
Transformation Rules | General Principles
Immediately useful results | Development and optimization ongoing
Predictions within seconds | Longer simulation times (minutes)
Only covers what has been programmed into it | Greater potential for generality and discovery
Only provides information on major product(s) | Kinetics simulations provide information on major and minor pathways
Acknowledgements
• Prof. Pierre Baldi (ICS)
• Prof. Elizabeth Jarvo (Chem)
• Dr. Susan King (Chem)
• Prof. Greg Weiss (Chem)
• Prof. David Van Vranken (Chem)
• Prof. James Nowick (Chem)
• NIH/NLM Biomedical Informatics Training Grant
• UCI Medical Scientist Training Program
• Orange County ARCS® Foundation
Students
• Chloe Azencott
• Matt Kayala
• Paul Rigor
• UCI Students
Academic Software
• OpenEye Software
• ChemAxon Software
• Peter Ertl, Novartis (JME Editor)
Course Instructors
• Prof. Suzanne Blum
• Prof. Zhibin Guan
• Prof. Larry Overman
• Prof. Ken Shea
• Dr. Mare Taagepera
• Prof. Chris Vanderwal
Extend Orbital Chaining
Interactions between a small set of fundamental orbital types dominate organic reactivity
Higher order interactions can be composed by chaining fundamental units together
nO > πC=C > π*C=O
nN > σ*H-C > σ*C-Br
Need for Reactivity Prediction
Retro-synthetic analysis usually only suggests precursors, but does not account for unintended reactivity
Existing systems may use exclusion rules
Best to reproduce forward sequence of suggested reactions to ensure reliability
Motivation
Total synthesis of important drugs and chemicals
• Taxol: anti-cancer (yew tree sap)
• Morphine: pain medication (opium poppies)
• Penicillin G: antibiotic (fungus)
Chemical modification to optimize lead compounds
• Andrimid: anti-tuberculosis lead compound
Goal / Hypothesis: Can a computer expert system reproduce the core problem-solving capabilities expected of human chemists?
SMILES Extensions
Atom Mapping
• Necessary to map reactant to product atoms
• Proper transform requires balanced stoichiometry
• Hydrogens generally must be explicitly specified
Carboxylic acid + primary amine >> amide + water
[O:1]=[C:2]([*:9])[O:3][H:7].[H:8][N:4]([*:10])[H:5]>>[O:1]=[C:2]([*:9])[N:4]([*:10])[H:5].[H:7][O:3][H:8]
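The balanced-mapping requirement can be checked mechanically. The sketch below is a simplified assumption: it compares only the atom map numbers on each side of ">>" using a regular expression, and does not parse the patterns or verify element balance. The function names are hypothetical.

```python
import re
from collections import Counter

# Matches the map number in atom expressions such as [C:1] or [Cl,Br,I;-:4].
MAP_RE = re.compile(r":(\d+)\]")


def atom_maps(side):
    """Count how often each atom map number appears on one side."""
    return Counter(int(number) for number in MAP_RE.findall(side))


def check_mapping(smirks):
    """Report map numbers that are duplicated, lost, or created."""
    reactants, products = smirks.split(">>")
    r_maps, p_maps = atom_maps(reactants), atom_maps(products)
    duplicated = {n for maps in (r_maps, p_maps)
                  for n, count in maps.items() if count > 1}
    return {
        "duplicated": sorted(duplicated),
        "lost": sorted(set(r_maps) - set(p_maps)),      # in reactants only
        "created": sorted(set(p_maps) - set(r_maps)),   # in products only
    }


AMIDE_FORMATION = (
    "[O:1]=[C:2]([*:9])[O:3][H:7].[H:8][N:4]([*:10])[H:5]"
    ">>[O:1]=[C:2]([*:9])[N:4]([*:10])[H:5].[H:7][O:3][H:8]"
)
SN2_PATTERN = "[CX4H2:1][Br:2]>>[C:1]O"

balanced = check_mapping(AMIDE_FORMATION)
unbalanced = check_mapping(SN2_PATTERN)
```

The amide transform accounts for every mapped atom on both sides, while the bond-rearrangement Sn2 pattern drops the mapped bromine and introduces an unmapped oxygen, which is why it describes an overall change rather than a balanced mechanistic step.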
[Figure: R1-C(=O)-OH + H-NH-R2 → R1-C(=O)-NH-R2 + H2O, with each atom labeled by its map number (1 to 10)]
Molecular Orbital List
Filled: sp2 O, π CO, …
Unfilled: π* CO, σ* HC π* CO, …

Filled: sp3 O, σ CO, …
Unfilled: σ* HO, σ* CO, …

Filled: sp3 O, sp2 O, …
Unfilled: π* SO, σ* HO π* SO, …
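Given such orbital lists, candidate elementary steps can be enumerated as every pairing of a filled (source) orbital with an unfilled (sink) orbital. The sketch below is illustrative only: the three groups are given placeholder names because the structures they belong to are not identified in the text, and the elided "…" entries are left out rather than guessed.

```python
from itertools import product

# Orbital lists as given above; "…" entries are omitted, not filled in.
# The group names are placeholders for the unlabeled structures.
ORBITALS = {
    "group_1": {"filled": ["sp2 O", "π CO"],
                "unfilled": ["π* CO", "σ* HC π* CO"]},
    "group_2": {"filled": ["sp3 O", "σ CO"],
                "unfilled": ["σ* HO", "σ* CO"]},
    "group_3": {"filled": ["sp3 O", "sp2 O"],
                "unfilled": ["π* SO", "σ* HO π* SO"]},
}


def candidate_interactions(orbitals, allow_intramolecular=False):
    """List (source group, filled orbital, sink group, unfilled orbital)."""
    pairs = []
    for (src_name, src), (sink_name, sink) in product(orbitals.items(), repeat=2):
        if src_name == sink_name and not allow_intramolecular:
            continue
        for filled, unfilled in product(src["filled"], sink["unfilled"]):
            pairs.append((src_name, filled, sink_name, unfilled))
    return pairs


intermolecular = candidate_interactions(ORBITALS)
all_pairs = candidate_interactions(ORBITALS, allow_intramolecular=True)
```

Each enumerated pair is only a candidate; ranking them is the job of the scoring function.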
Outline
Motivation
• Need for reactivity prediction
Rules-based Predictor Capabilities
• Predictive general reagents (NaOH), with mechanism explanations
• Synthesis workspace (Tylenol #94)
Principle-based Functional Demo Intent
• Complex example evolving over time
• Kinetic vs. thermodynamic example to illustrate simulation controls
Fundamental Reaction Unit Model
• Chaining: retain simple set of fundamental orbitals, then just compose for higher order
Scoring Interactions
• Qualitative Knowledge vs. Quantitative Data
Simulations
• Chemical Kinetics
• Discrete model for bootstrapping from incomplete starting information
Rules vs. Principles
Ongoing Work
• Parameter development for more reactivity classes