Reaction simulation expert systems for synthetic organic chemistry Jonathan H. Chen and Pierre Baldi University of California, Irvine School of Information and Computer Sciences Institute for Genomics and Bioinformatics School of Medicine http://cdb.ics.uci.edu

Reaction simulation expert systems for synthetic …acscinf.org/docs/meetings/237nm/presentations/237nm84.pdfCritical for applications such as retro-synthesis design and reaction discovery


Reaction simulation expert systems for synthetic organic chemistry

Jonathan H. Chen and Pierre Baldi

University of California, Irvine School of Information and Computer Sciences

Institute for Genomics and Bioinformatics School of Medicine

http://cdb.ics.uci.edu


Reaction Prediction

  Given a mixture of reactants and reaction conditions, predict the major products

[Figure: reactant + reactant, NaOMe, Δ → ?]

  Fundamental problem-solving skill of expert human chemists

  Critical for applications such as retro-synthesis design and reaction discovery


Need for Reproducible Expertise

Automated suggestion of synthetic reactions by pattern matching is straightforward, but “expertise” is knowing which suggestions are actually feasible and reasonable

[Figure: example drug structures (bupropion, atorvastatin, albuterol, fenofibrate) with candidate reagents (DCC, KMnO4, Pd(0), CO (gas), Mg)]


Transformation Rules

[Figure: mechanism scheme, π-bond protic acid addition followed by carbocation halide addition]

  Chemical state machine modeling at mechanistic level of detail

    State information: Molecular structure
    State transition: Transformation rules

Description                     SMIRKS
Alkene, Protic Acid Addition    [C:1]=[C:2].[H:3][Cl,Br,I:4]>>[+0:3][C:1][C+:2].[Cl,Br,I;-:4]
Carbocation, Halide Addition    [C+:1].[Cl,Br,I;-:2]>>[C+0:1][+0:2]
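
The state-machine view above can be illustrated with a toy sketch. It is not the Reaction Explorer implementation: real rules are SMIRKS patterns matched against molecular graphs by a cheminformatics toolkit, whereas here each species is only a hypothetical string label and each rule is a lookup from reactant labels to product labels. The names `Rule`, `apply_rules`, and the species labels are assumptions made for illustration.

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class Rule(NamedTuple):
    """A state transition: consume `reactants`, produce `products`."""
    name: str
    reactants: FrozenSet[str]
    products: FrozenSet[str]


# Hypothetical species labels standing in for molecular structures.
RULES = [
    Rule("Alkene, Protic Acid Addition",
         frozenset({"alkene", "HBr"}),
         frozenset({"carbocation", "Br-"})),
    Rule("Carbocation, Halide Addition",
         frozenset({"carbocation", "Br-"}),
         frozenset({"alkyl bromide"})),
]


def apply_rules(state: FrozenSet[str],
                rules: List[Rule]) -> Tuple[FrozenSet[str], List[str]]:
    """Fire the first applicable rule repeatedly until no rule matches.

    Returns the final state and the ordered list of rule names used,
    which serves as a mechanistic trace.
    """
    trace = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.reactants <= state:
                # Replace the matched reactants with the rule's products.
                state = (state - rule.reactants) | rule.products
                trace.append(rule.name)
                changed = True
                break
    return state, trace


final_state, trace = apply_rules(frozenset({"alkene", "HBr"}), RULES)
print(sorted(final_state))
print(trace)
```

The trace is what makes the prediction explainable: each product can be justified by the sequence of mechanistic steps that led to it.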


Reaction Explorer [DEMO]

  Product prediction for different reactions that share a common reagent
    Sn2: CC1(Oc2cc(c3cc[nH]c3c2O1)CBr)C
    Nucleophilic Acylation: c1c2c(c(cn1)Cl)CCOC2=O
    Robinson Annulation: C[C@H]1c2c(cccn2)CCC1=O

  Mechanistic explanation of how and why products are created

  Use as synthesis workspace
    Tylenol #94


Results and Progress

Expert system with over
  •  80 reagent models
  •  1,500 reaction rules
  •  4,500 validation examples

Subject Categories Implemented
  •  Substitution and Elimination of Alkyl Halides
  •  Alcohols and Epoxides
  •  Alkenes, Electrophilic Addition
  •  Alkynes, Addition and Acetylide Ions
  •  Alkanes, Radical Reactions
  •  Dienes, Conjugation, Diels-Alder
  •  Electrophilic Aromatic Substitution
  •  Reactions of Substituted Benzenes
  •  Oxidation-Reduction Reactions
  •  Aldehydes and Ketones
  •  Carboxylic Acid Derivatives
  •  Enolate Chemistry
  •  Aldol Chemistry
  •  Amines and Arenediazonium Reactions
  •  Transition Metal (Palladium) Catalysis
  •  SnAr and Benzyne Reactions
  •  Naphthalene and Heteroaromatic Reactions
  •  Pericyclic Reactions
  •  Carbohydrates
  •  Amino Acid and Peptide Synthesis

J. Chem. Educ. 2008, 85, 1699


Principle-Driven Simulations

  Not based on transformation rules
  Driven by principles of physical chemistry

Key Components

  Core Reaction Unit Model
  Scoring Function for Reactions
  Chemical Kinetics Simulation

[Figure: orbital labels nN, π*C-O, nO, σ*C-Cl alongside orbital levels σ*, π*, p, n, π, σ]

[Figure: energy diagram of Relative Energy vs. Reaction Coordinate, marking ΔG and ΔG‡]


Core Reaction Unit Model

  Bond-rearrangement patterns are the most typical choice.
  These only represent the overall “symptom” of the reaction and not the underlying mechanistic steps.
  Many such patterns must be “memorized” to get decent coverage.

Sn2 Substitution
[CX4H2:1][Br:2]>>[C:1]O

Acyl Substitution (Saponification)
[O:2]=[C:1][OH0:3]>>[O:2]=[C:1][O-].[O-:3]

Robinson Annulation
[*:3][C:2]1[C:11][C:10][C:9][C:8][C:1]1=[O:20].[C:5][C:4](=[O:12])[C:6]=[C:7]>>[*:3][C:2]12[C:11][C:10][C:9][C:8][C:1]1=[C:5][C:4](=[O:12])[C:6][C:7]2


Core Reaction Unit Model

More Favorable              Less Favorable
nCl > pC                    nI > pC
πC=C > σ*H-Br               πC=C > σ*H-O
σH-B > π*C=O (ketone)       σH-B > π*C=O (amide)

[Figure: orbital levels σ*, π*, p, n, π, σ]

Molecular Orbital Interactions as Elementary Reaction Steps
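
One way to picture orbital interactions as elementary steps is to enumerate every pairing of a filled orbital (electron source) with an unfilled orbital (electron sink). The sketch below does only that enumeration; the `Orbital` class, its string labels, and the treatment of `p` as an empty orbital (as on a carbocation) are assumptions for illustration, not the system's actual data model.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass(frozen=True)
class Orbital:
    kind: str    # one of: "n", "pi", "sigma", "p", "pi*", "sigma*"
    atoms: str   # hypothetical label for the atom(s) involved, e.g. "C=C"

    @property
    def filled(self) -> bool:
        # n, pi and sigma hold electrons; p (assumed empty here, as on a
        # carbocation), pi* and sigma* are treated as electron sinks.
        return self.kind in {"n", "pi", "sigma"}


def elementary_steps(orbitals: List[Orbital]) -> List[Tuple[Orbital, Orbital]]:
    """Pair every filled orbital (source) with every unfilled orbital (sink)."""
    sources = [o for o in orbitals if o.filled]
    sinks = [o for o in orbitals if not o.filled]
    return list(product(sources, sinks))


mixture = [
    Orbital("pi", "C=C"),
    Orbital("sigma*", "H-Br"),
    Orbital("n", "Cl"),
    Orbital("p", "C"),
]
steps = elementary_steps(mixture)
for source, sink in steps:
    print(f"{source.kind}({source.atoms}) -> {sink.kind}({sink.atoms})")
```

Enumeration alone says nothing about which pairing is favorable; ranking the candidates is the job of a scoring function.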


Scoring Function for Reactions

  Purpose
    Identify favorable reaction steps
    Ideally predicts transition state activation energies (ΔG‡)

  Statistical Machine Learning
    Limited quantitative data available
    Inspiration from the problem-solving abilities of human experts
    Use qualitative knowledge of reactivity trends as a major training data source

[Figure: energy diagram of Relative Energy vs. Reaction Coordinate, marking ΔG and ΔG‡]

C. A. Azencott, M. A. Kayala, P. Baldi, “Learning Scoring Functions for Chemical Expert Systems”
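
Qualitative trends of the form "step A is more favorable than step B" can serve as training data for a scoring function through pairwise ranking. The sketch below uses a simple perceptron-style update on a linear score. It is a generic illustration, not the method of the cited work, and the feature names are made up.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]


def score(weights: Dict[str, float], x: Features) -> float:
    """Linear score: higher means a more favorable reaction step."""
    return sum(weights.get(k, 0.0) * v for k, v in x.items())


def train_pairwise(pairs: List[Tuple[Features, Features]],
                   epochs: int = 50, lr: float = 0.1) -> Dict[str, float]:
    """Perceptron-style ranking: each pair is (more_favorable, less_favorable)."""
    weights: Dict[str, float] = {}
    for _ in range(epochs):
        for better, worse in pairs:
            # Update only when the current weights rank the pair wrongly
            # (or tie), nudging the weights toward the preferred step.
            if score(weights, better) <= score(weights, worse):
                for k, v in better.items():
                    weights[k] = weights.get(k, 0.0) + lr * v
                for k, v in worse.items():
                    weights[k] = weights.get(k, 0.0) - lr * v
    return weights


# Made-up one-hot features; the pairs mirror qualitative trends such as
# "attack on H-Br beats attack on H-O" and "ketone beats amide".
pairs = [
    ({"sink_sigma*_H-Br": 1.0}, {"sink_sigma*_H-O": 1.0}),
    ({"sink_pi*_ketone": 1.0}, {"sink_pi*_amide": 1.0}),
]
weights = train_pairwise(pairs)
for better, worse in pairs:
    print(score(weights, better) > score(weights, worse))
```

Only the ordering within each pair is used, so no measured activation energies are required to train this kind of model.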


Chemical Kinetics Simulation

Law of Mass Action Simulation

  Results depend on reactivity scores and concentrations
  Reversible reactions driven by Le Chatelier’s principle
  Catalytic quantities of highly reactive species
  Discrete simulation approximation to bootstrap off incomplete information

Eyring-Evans-Polanyi Equation

  Principled conversion of ΔG‡ to reaction rate constant k with temperature dependence
  Theory only applies for elementary reaction steps
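
As a minimal sketch of how these two pieces fit together, the code below converts activation free energies to rate constants with the Eyring equation, k = (k_B T / h) exp(-ΔG‡ / RT), assuming a transmission coefficient of 1, and then integrates a single reversible step A ⇌ B under the law of mass action with explicit Euler steps. The barrier heights, step size, and function names are illustrative assumptions, not values from the presentation.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J*s
R = 8.314462618         # gas constant, J/(mol*K)


def eyring_rate(dg_act: float, temperature: float) -> float:
    """First-order rate constant (1/s) from an activation free energy in J/mol."""
    return (K_B * temperature / H) * math.exp(-dg_act / (R * temperature))


def simulate_reversible(kf: float, kr: float, a0: float, b0: float,
                        dt: float, steps: int):
    """Explicit Euler integration of A <=> B under the law of mass action."""
    a, b = a0, b0
    for _ in range(steps):
        net = (kf * a - kr * b) * dt   # net amount converted A -> B this step
        a -= net
        b += net
    return a, b


T = 298.15
kf = eyring_rate(80_000.0, T)   # hypothetical forward barrier, 80 kJ/mol
kr = eyring_rate(85_000.0, T)   # hypothetical reverse barrier, 85 kJ/mol
a, b = simulate_reversible(kf, kr, a0=1.0, b0=0.0, dt=0.1, steps=10_000)
print(f"kf={kf:.3g} 1/s, kr={kr:.3g} 1/s, [B]/[A]={b / a:.3f}")
```

Run long enough, the concentration ratio settles at kf/kr, which is the equilibrium behavior that Le Chatelier's principle describes; stopping early instead shows the kinetically controlled mixture.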


Reaction Simulator [DEMO]

  Simple enolate deprotonation
    No other input but starting materials, self-perception of reactive sites and combinations
    Kinetic vs. thermodynamic simulator controls

  Complex example evolving over time
    Trace full reaction pathway to justify the prediction, including energy diagram


Summary Comparison

Transformation Rules                            General Principles
Immediately useful results                      Development and optimization ongoing
Predictions within seconds                      Longer simulation times (minutes)
Only covers what has been programmed into it    Greater potential for generality and discovery
Only provides information on major product(s)   Kinetics simulations provide information on major and minor pathways


Acknowledgements

  Prof. Pierre Baldi (ICS)
  Prof. Elizabeth Jarvo (Chem)
  Dr. Susan King (Chem)
  Prof. Greg Weiss (Chem)
  Prof. David Van Vranken (Chem)
  Prof. James Nowick (Chem)

  NIH/NLM Biomedical Informatics Training Grant
  UCI Medical Scientist Training Program
  Orange County ARCS® Foundation

Students
  Chloe Azencott
  Matt Kayala
  Paul Rigor
  UCI Students

Academic Software
  OpenEye Software
  ChemAxon Software
  Peter Ertl, Novartis (JME Editor)

Course Instructors
  Prof. Suzanne Blum
  Prof. Zhibin Guan
  Prof. Larry Overman
  Prof. Ken Shea
  Dr. Mare Taagepera
  Prof. Chris Vanderwal

http://cdb.ics.uci.edu


Extend Orbital Chaining

  Interactions between a small set of fundamental orbital types dominate organic reactivity

  Higher order interactions can be composed by chaining fundamental units together

nO > πC=C > π*C=O
nN > σ*H-C > σ*C-Br
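
The idea of composing higher order interactions from fundamental units can be sketched as joining pairwise (source, sink) units end to end, where each unit must begin at the orbital where the previous one ended. The orbital labels are plain strings copied from the slide, and `compose_chain` is a hypothetical helper, not part of the described system.

```python
from typing import List, Tuple

Unit = Tuple[str, str]  # (electron source, electron sink), as plain labels


def compose_chain(units: List[Unit]) -> List[str]:
    """Join pairwise units into one chain; adjacent units must share an orbital."""
    if not units:
        raise ValueError("need at least one unit")
    chain = list(units[0])
    for source, sink in units[1:]:
        if source != chain[-1]:
            raise ValueError(f"{source!r} does not continue from {chain[-1]!r}")
        chain.append(sink)
    return chain


# The two chained examples from the slide, written as pairwise units.
chain_a = compose_chain([("nO", "πC=C"), ("πC=C", "π*C=O")])
chain_b = compose_chain([("nN", "σ*H-C"), ("σ*H-C", "σ*C-Br")])
print(" > ".join(chain_a))
print(" > ".join(chain_b))
```

Keeping the unit set small and composing chains on demand avoids defining a separate pattern for every multi-orbital interaction.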


Need for Reactivity Prediction

  Retro-synthetic analysis usually only suggests precursors, but does not account for unintended reactivity

  Existing systems may use exclusion rules

  Best to reproduce forward sequence of suggested reactions to ensure reliability


Motivation

  Total Synthesis of important drugs and chemicals
    Taxol: Anti-cancer, Yew Tree Sap
    Morphine: Pain Medication, Opium Poppies
    Penicillin G: Antibiotic, Fungus

  Chemical Modification to optimize lead compounds
    Andrimid: Anti-Tuberculosis Lead Compound

  Goal / Hypothesis: Can a computer expert system reproduce the core problem-solving capabilities needed of human chemists?


SMILES Extensions

  Atom Mapping
    Necessary to map reactant to product atoms
    Proper transform requires balanced stoichiometry

  Hydrogens generally must be explicitly specified

[O:1]=[C:2]([*:9])[O:3][H:7].[H:8][N:4]([*:10])[H:5]>>[O:1]=[C:2]([*:9])[N:4]([*:10])[H:5].[H:7][O:3][H:8]

Carboxylic acid    [O:1]=[C:2]([*:9])[O:3][H:7]
+ Primary amine    [H:8][N:4]([*:10])[H:5]
>>
Amide              [O:1]=[C:2]([*:9])[N:4]([*:10])[H:5]
+ Water            [H:7][O:3][H:8]

[Figure: R1-C(=O)-OH + H-NH-R2 → R1-C(=O)-NH-R2 + H2O, atoms labeled with map numbers 1-10]
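
The balanced-stoichiometry requirement can be checked mechanically for the mapped atoms: every atom map number should appear exactly once on each side of `>>`. The sketch below does this with a regular expression over the reaction string. It inspects map numbers only, not unmapped atoms or chemistry, and the function names are illustrative.

```python
import re
from collections import Counter
from typing import Tuple

MAP_RE = re.compile(r":(\d+)\]")  # atom map number just before a closing bracket


def map_counts(side: str) -> Counter:
    """Count how often each atom map number occurs on one side of a reaction."""
    return Counter(int(m) for m in MAP_RE.findall(side))


def is_balanced(smirks: str) -> Tuple[bool, str]:
    """Check that every mapped atom appears exactly once on each side."""
    reactants, products = smirks.split(">>")
    left, right = map_counts(reactants), map_counts(products)
    if any(c > 1 for c in left.values()) or any(c > 1 for c in right.values()):
        return False, "duplicate map number on one side"
    if set(left) != set(right):
        unmatched = sorted(set(left) ^ set(right))
        return False, f"unmatched map numbers: {unmatched}"
    return True, "balanced"


amide = ("[O:1]=[C:2]([*:9])[O:3][H:7].[H:8][N:4]([*:10])[H:5]>>"
         "[O:1]=[C:2]([*:9])[N:4]([*:10])[H:5].[H:7][O:3][H:8]")
print(is_balanced(amide))
print(is_balanced("[CX4H2:1][Br:2]>>[C:1]O"))
```

The second call uses the Sn2 bond-rearrangement pattern from the Core Reaction Unit Model slide: its bromine is mapped on the reactant side but absent from the products, so it fails the check.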


Molecular Orbital List

Filled
  •  sp2 O
  •  π CO
  •  …
Unfilled
  •  π* CO
  •  σ* HC π* CO
  •  …

Filled
  •  sp3 O
  •  σ CO
  •  …
Unfilled
  •  σ* HO
  •  σ* CO
  •  …

Filled
  •  sp3 O
  •  sp2 O
  •  …
Unfilled
  •  π* SO
  •  σ* HO π* SO
  •  …


Outline

  Motivation
    Need for reactivity prediction

  Rules-based Predictor Capabilities
    Predictive general reagents (NaOH), with mechanism explanations
    Synthesis workspace (Tylenol #94)

  Principle-based Functional Demo Intent
    Complex example evolving over time
    Kinetic vs. thermodynamic example to illustrate simulation controls

  Fundamental Reaction Unit Model
    Chaining: Retain simple set of fundamental orbitals, then just compose for higher order

  Scoring Interactions
    Qualitative Knowledge vs. Quantitative Data

  Simulations
    Chemical Kinetics
    Discrete model for bootstrapping from incomplete starting information

  Rules vs. Principles

  Ongoing Work
    Parameter development for more reactivity classes