Deep Reinforcement Learning for Pre-Clinical Drug Development
Fellow: Nathan Russell
Advisors: Jian Peng (CS), Marty Burke (Chemistry)
Big Picture
Capitalized costs in 2017 dollars, according to Tufts CSDD (2014): $1.15 billion and $1.53 billion
[Figure labels: Direct Impact, Indirect Impact]
Pre-Clinical Tasks
Machine Learning
Optimize Reaction: How do we make it efficiently?
Synthesis Planning: How do we make the molecule?
Candidate Search / Drug Discovery: Can we find the ideal molecule among the infinitely many possible?
Candidate Evaluation: How do we evaluate whether the molecule meets our needs?
Target Identification and Validation: What does the molecule need to do?
Resources & Contributions
Contributions
Generative Multitask Molecular Network
Model Assisted Bayesian Optimization
Graph Translation Policy Network
Resources
Known Reactions: ~10e6 examples
Automated Lab Evaluation
Structure Only: ~10e5 examples
Structure + Labels: ~10e4 examples
Target Identification and Validation (What I did 16-17)
Common Tasks
➢ Gene / Protein Expression Studies
➢ Biomarker Discovery
➢ Biochemical and Cellular Pathway Analysis
➢ Cell Analysis
Tools made to Support those Tasks
➢ Heterogeneous Network Embedding
➢ Scalable Manifold Embedding and Viz
➢ Interpretable Pattern Discovery
➢ Real Time Filtering and Querying of Biological Networks
➢ Deep Graph Embedding
In Silico Candidate Evaluation
Molecule ▻ Neural Network ▻ Prediction: Probability of Binding (0,1)
Molecule ▻ Experimentation ▻ Ground Truth: Binds to Protein (Yes / No)
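As a rough illustration of how a predicted binding probability can be scored against an experimental yes/no outcome, the sketch below computes binary cross-entropy. The slides do not state which loss is used, so cross-entropy is an assumption here, and the four predictions and labels are made-up values.

```python
import math

def binary_cross_entropy(p_pred, y_true, eps=1e-12):
    """Mean log loss between predicted binding probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(p_pred, y_true):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical network outputs for four molecules (illustrative only).
predicted = [0.9, 0.2, 0.7, 0.1]
# Hypothetical experimental ground truth: 1 = binds, 0 = does not bind.
measured = [1, 0, 1, 0]

loss = binary_cross_entropy(predicted, measured)
print(f"mean log loss: {loss:.4f}")
```

A confident wrong prediction is penalized much more heavily than a confident correct one, which is the behavior one usually wants when experiments are expensive.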
Generative Multitask Molecular Network (GMMN)
Novel Tree Encoder / Decoder Network + Regularization of Multiple Roots
Supervised + Unsupervised Joint Training
Variational Autoencoder
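One common way to combine supervised and unsupervised training in a variational autoencoder is to add a label loss, computed only on labeled molecules, to the usual reconstruction and KL terms. The sketch below shows that combination under stated assumptions: the squared-error reconstruction term, the weights `beta` and `alpha`, and the function name `joint_loss` are all hypothetical, and the tree encoder / decoder itself is not modeled.

```python
import numpy as np

def joint_loss(x, x_recon, mu, log_var, y, y_prob, labeled_mask,
               beta=1.0, alpha=1.0):
    """Reconstruction + beta * KL + alpha * supervised loss on labeled rows."""
    # Unsupervised part 1: squared reconstruction error per example.
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    # Unsupervised part 2: KL divergence from N(mu, sigma^2) to N(0, I).
    kl = np.mean(-0.5 * np.sum(1 + log_var - mu ** 2 - np.exp(log_var), axis=1))
    # Supervised part: binary cross-entropy, averaged over labeled rows only.
    p = np.clip(y_prob, 1e-12, 1 - 1e-12)
    bce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    n_labeled = labeled_mask.sum()
    sup = float((bce * labeled_mask).sum() / n_labeled) if n_labeled > 0 else 0.0
    return recon + beta * kl + alpha * sup

# Two hypothetical molecules; only the first one has a label.
x = np.ones((2, 3))
mu, log_var = np.zeros((2, 2)), np.zeros((2, 2))
value = joint_loss(x, x, mu, log_var,
                   y=np.array([1.0, 0.0]),
                   y_prob=np.array([0.5, 0.5]),
                   labeled_mask=np.array([1.0, 0.0]))
print(value)
```

Masking the supervised term lets the large structure-only set shape the latent space while the smaller labeled set steers it toward the properties of interest.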
Graph Translation Policy Network with External Memory
Policy Network
Chemical Environment
Action: Rearrange Bonds / Atoms
Deep Reinforcement Learning
Reward: Structural Similarity, Efficiency, Functional Efficacy
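The policy / environment / action / reward loop above can be illustrated with a toy policy-gradient update. This is only a sketch: it is a three-action softmax policy, not a graph translation network, the actions stand in for hypothetical bond rearrangements, and the reward table is made up. The baseline is computed exactly from the known toy rewards, which would not be possible in a real chemical environment.

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

# Made-up rewards for three hypothetical bond-rearrangement actions.
rewards = np.array([0.1, 0.9, 0.3])

rng = np.random.default_rng(0)
logits = np.zeros(3)   # policy parameters, start uniform
lr = 0.5               # learning rate (arbitrary choice)

for _ in range(500):
    probs = softmax(logits)
    action = rng.choice(3, p=probs)      # policy picks an action
    reward = rewards[action]             # environment returns a reward
    baseline = probs @ rewards           # expected reward under the policy
    grad_log_pi = -probs                 # gradient of log pi(action) w.r.t. logits
    grad_log_pi[action] += 1.0
    logits += lr * (reward - baseline) * grad_log_pi  # REINFORCE step

probs = softmax(logits)
print(probs)
```

Over training, probability mass shifts toward the action with the highest reward; in the real setting the reward would combine the structural, efficiency, and efficacy terms named on the slide.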
English ▻ Chinese (中文)
Reactants ▻ Products
Folded ▻ Unfolded
Good ▻ Great Molecule
Machine Translation
✓ No Compression Bottleneck
✓ Arbitrary Input / Output
✓ Reward signal is better than character-level cross entropy
Candidate Search: 3 ways to discover
Multi-Stage Virtual Screening
Search as Recursive Translation
Generative Multitask Molecular Network
Model Assisted Bayesian Optimization
Graph Translation Policy Network
Latent Space Search
Current ▻ Good ▻ Great
Simulation
Classifier / Regressor
Rule Sets
NEW
2016
Pre-2016
Improves
Improves
Enables for the 1st Time
Synthesis Planning
Automation Friendly Natural Product Synthesis Library*
Generalized Synthetic Planning
Linear natural products
FRAGMENTS from all allowed retrosyntheses
Smallest set of redundant FRAGMENTS that cover most natural product chemical space
Double the FRAGMENTS to arrive at blocks
• New Heuristic and Combinatorial Optimization
• New Subgraph Isomorphism Clustering algorithms
* Martin Burke, Andrea Palazzo, & Claire Simons are leading this endeavor
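Choosing a small fragment set that covers most natural products resembles a set-cover problem. The slides mention new heuristics without describing them, so the sketch below substitutes the textbook greedy set-cover heuristic. It also simplifies by treating a product as covered once any chosen fragment appears in it. The fragment and product names are hypothetical.

```python
def greedy_fragment_cover(fragment_to_products, target_fraction=1.0):
    """Pick fragments greedily until target_fraction of products is covered."""
    all_products = set().union(*fragment_to_products.values())
    needed = target_fraction * len(all_products)
    covered, chosen = set(), []
    while len(covered) < needed:
        # Fragment that adds the most not-yet-covered products.
        best = max(fragment_to_products,
                   key=lambda f: len(fragment_to_products[f] - covered))
        gain = fragment_to_products[best] - covered
        if not gain:  # nothing left to gain; stop early
            break
        chosen.append(best)
        covered |= gain
    return chosen, covered

# Hypothetical mapping: fragment -> natural products it appears in.
fragments = {
    "F1": {"P1", "P2", "P3"},
    "F2": {"P3", "P4"},
    "F3": {"P4", "P5"},
    "F4": {"P1"},
}
chosen, covered = greedy_fragment_cover(fragments)
print(chosen, sorted(covered))
```

Lowering `target_fraction` trades coverage for a smaller library, which matches the slide's goal of covering most, not all, of natural product chemical space.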
Graph Translation Policy Network
• MCTS-based retrosynthetic planning can only use existing reactions
• New molecules will require new reactions; the GTPN can be used as a conditional generative model to propose new reactions given the end points
Optimize Synthesis
Sequential Decision Making Agent
High Dimensional Sparse Binary Representation
Low Dimensional Dense Real Representation
Model Assisted Parallel Bayesian Optimization
Lab Automation
I. Pretrained model learns the latent distribution and jointly optimizes over it
II. Parallel Bayesian framework enables batch-style lab automation
III. Learns within and between experiments
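To make the batch idea in point II concrete, here is a minimal sketch of one way to propose several experiments at once: a zero-mean Gaussian process with an RBF kernel, an upper-confidence-bound score, and the "kriging believer" trick of pretending each chosen point returned its predicted mean. None of these choices come from the slides; the kernel, `kappa`, the noise level, and the 1-D latent coordinates are all assumptions for illustration.

```python
import numpy as np

def rbf(a, b, length=1.0):
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / length ** 2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at candidate points."""
    k = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_s = rbf(x_obs, x_cand)
    solved = np.linalg.solve(k, k_s)            # K^-1 K_s
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_s * solved, axis=0)
    return mean, np.maximum(var, 0.0)

def propose_batch(x_obs, y_obs, x_cand, batch_size=3, kappa=2.0):
    """Pick batch_size distinct candidates to run in parallel."""
    x_obs, y_obs = x_obs.copy(), y_obs.copy()
    available = np.ones(len(x_cand), dtype=bool)
    batch = []
    for _ in range(batch_size):
        mean, var = gp_posterior(x_obs, y_obs, x_cand)
        ucb = mean + kappa * np.sqrt(var)
        ucb[~available] = -np.inf               # never pick the same point twice
        i = int(np.argmax(ucb))
        batch.append(i)
        available[i] = False
        # Kriging believer: treat the predicted mean as if it were observed.
        x_obs = np.vstack([x_obs, x_cand[i]])
        y_obs = np.append(y_obs, mean[i])
    return batch

# Hypothetical 1-D latent coordinates and centered yields.
x_obs = np.array([[0.0], [1.0]])
y_obs = np.array([-0.3, 0.3])
x_cand = np.linspace(-0.75, 2.75, 8).reshape(-1, 1)
batch = propose_batch(x_obs, y_obs, x_cand)
print(batch)
```

Because each pretended observation shrinks the uncertainty around the chosen point, later picks in the same batch are pushed toward other regions, which suits a lab that runs several reactions per round.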
Metric Space Properties
N² Scaling vs. N × D Scaling
N = # of Molecules
D = Latent Dimensionality
[Figure: two panels, each showing points A, B, C]
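The scaling contrast can be checked with simple arithmetic: storing every pairwise relationship needs N × N entries, while storing one D-dimensional vector per molecule needs N × D. The sketch below uses hypothetical values for N and D. It also checks the triangle inequality on Euclidean distances between three random embeddings, which is one reading of the A, B, C panels and an assumption rather than something the slides state.

```python
import numpy as np

def pairwise_matrix_entries(n_molecules):
    """Entries needed to store all pairwise relationships explicitly."""
    return n_molecules * n_molecules

def embedding_entries(n_molecules, latent_dim):
    """Entries needed to store one latent vector per molecule."""
    return n_molecules * latent_dim

def euclidean(u, v):
    return float(np.linalg.norm(u - v))

n, d = 100_000, 64                       # hypothetical N and D
n2_cost = pairwise_matrix_entries(n)     # 10,000,000,000 entries
nd_cost = embedding_entries(n, d)        # 6,400,000 entries
print(n2_cost, nd_cost, n2_cost // nd_cost)

# Distances computed from embeddings form a metric, so the
# triangle inequality holds for any three points A, B, C.
rng = np.random.default_rng(0)
a, b, c = rng.normal(size=(3, d))
triangle_ok = euclidean(a, c) <= euclidean(a, b) + euclidean(b, c)
print(triangle_ok)
```

With these illustrative numbers the embedding is more than a thousand times smaller than the full pairwise table, and any pairwise distance can still be recomputed on demand.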