Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Machine Learning for Drug Development
1ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Marinka ZitnikDepartment of Biomedical InformaticsBroad Institute of Harvard and MITHarvard Data Science Initiative
[email protected]://zitniklab.hms.harvard.edu
OutlineOverview and introduction
Part 1: Virtual drug screening and drug repurposing
Part 2: Adverse drug effects, drug-drug interactions
Part 3: Clinical trial site identification, patient recruitment
Part 4: Molecule optimization, molecular graph generation, multimodal graph-to-graph translation
Part 5: Molecular property prediction and transformers
Demos, resources, wrap-up & future directions2ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Datasets to facilitate algorithmic innovation
3ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Therapeutics are one of most exciting areas for computational
scientists. However,Retrieving, curating, and processing datasets is time-consuming and
requires extensive domain expertise
Datasets are scattered around the bio repositories and there is no centralized data repository for a variety of therapeutics
Many tasks are under-explored in AI/ML communitybecause of the lack of data access
4ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
● Open-Source ML Datasets for Therapeutics:○ Wide range of tasks: target discovery, activity
screening, efficacy, safety, manufacturing○ Wide range of products: small molecules, antibodies,
vaccine, miRNA● Numerous Data Functions:
○ Extensive data functions ○ Model evaluation, data processing and splits,
molecule generation oracles, and much more● 3 Lines of Code:
○ Minimum package dependency, lightweight loaders
5ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Domain scientists
MLscientists
Identify meaningful learning tasks
Design powerful ML models
Advancing algorithms for key therapeutics problems
Our Vision for TDC
6ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Modular Structure of TDC
7
Single-instance
Multi-instance
Generation
Y
Y
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
DrugRes
ADME
Tox
HTS
QM
Yields
Paratope
Epitope
Develop
DTI
DDI
PPI
GDA
DrugSyn
PeptideMHC
AntibodyAff
MTI
Catalyst
Reaction
MolGen
PairMolGen
RetroSyn
67 datasets spread over 22 learning tasks
8ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Model performance evaluators
Data processing helpers
A variety of data splits
Data Functions to Support Your Research
9ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
GuacaMol
MOSES
Literature
Molecule Generation
GeneratedMolecules
Oracle Score
Optimize
GuacaMol: Benchmarking Models for de Novo Molecular Design, J. Chem. Inf. Model., 2019MOSES: A Benchmarking Platform for Molecular Generation Models, Frontiers in Pharmacology, 2020
Molecule Generation Oracles
10ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Leaderboards: Submit your Models
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021 11
zitniklab.hms.harvard.edu/TDC
You Are Invited to Join TDC! TDC is an Open-Source, Community Effort
github.com/mims-harvard/TDC
zitniklab.hms.harvard.edu/TDC
12ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Demos, tools, and implementations
13ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
DeepPurpose: Deep Learning Library for Compound and Protein Modeling
DTI, Drug Property, PPI, DDI, Protein Function Prediction
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021 14
DeepPurpose: a Deep Learning Library for Drug-Target Interaction Prediction, Bioinformatics 2020
https://github.com/kexinhuang12345/DeepPurpose
How can domain scientists interact with AI systems?
15
DeepPurpose: a Deep Learning Library for Drug-Target Interaction Prediction, Bioinformatics 2020MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning, NeurIPS 2020ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
MolDesigner: Interactive Design of Drugs with Deep Learning
16http://deeppurpose.sunlab.orgML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
DEMO: DRUG-TARGET INTERACTION PREDICTION
Drug: Remdesivir Remdesivir is indicated for the treatment of adult and pediatric patients aged 12 years and over weighing at least 40 kg for coronavirus disease 2019 (COVID-19) infection requiring hospitalization.
Target protein: Replicase polyprotein 1ab. Multifunctional protein involved in the transcription and replication of viral RNAs
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021 17
Molecular structure of Remdesivir Amino acid sequence of Replicase polyprotein 1ab
How can domain scientists interact with AI systems?
18
DeepPurpose: a Deep Learning Library for Drug-Target Interaction Prediction, Bioinformatics 2020MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning, NeurIPS 2020ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Automating Science
19ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Automating Science
20ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
How to explain predictions?Key idea: § Summarize where in the data the model “looks” for
evidence for its prediction§ Find a small subgraph most influential for the prediction
Approach to generate explanations for graph neural networks based on counterfactual reasoning
21GNNExplainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
GNNExplainer: Key Idea§ Input: Given prediction 𝑓(𝑥) for node/link 𝑥§ Output: Explanation, a small subgraph 𝑀! together
with a small subset of node features:§ 𝑀! is most influential for prediction 𝑓(𝑥)
§ Approach: Learn 𝑀! via counterfactual reasoning§ Intuition: If removing 𝑣 from
the graph strongly decreases the probability of prediction ⇒ 𝑣 is a good counterfactual explanation for the prediction
GNN Explainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019 22ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Examples of Explanations”Will rosuvastatin treat hyperlipidemia? Whatis the disease treatment mechanism?”
New Algorithms: GNNExplainer: Generating Explanations for Graph Neural Networks, NeurIPS 2019New Insights: Discovery of Disease Treatment Mechanisms through the Multiscale Interactome, Nature Communications 2021 (in press)23ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Open challenges and future directions
24ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
Learn about Therapeutics ML!
https://www.drugsymposium.org
Videos from the presentations are now publicly available to everyone through the Symposium Video Channel
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021 25
Open Challenges§ Disconnected, uncoupled biomedical knowledge:
§ Challenge: Need to combine data in their broadest sense to close the gap between research and patient data
§ Diverse mechanisms of drug action:§ Challenge: Need to consider diverse mechanisms through which a
drug can treat a disease
§ Novel drugs in development, emerging diseases:§ Challenge: Need to learn and reason about never-before-seen
phenomena
§ Datasets for a variety of therapeutics tasks:§ Challenge: Need datasets and benchmarks to accelerate ML
model development, validation and transition into production and clinical implementation
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021 26
OutlineOverview and introduction
Part 1: Virtual drug screening and drug repurposing
Part 2: Adverse drug effects, drug-drug interactions
Part 3: Clinical trial site identification, patient recruitment
Part 4: Molecule optimization, molecular graph generation, multimodal graph-to-graph translation
Part 5: Molecular property prediction and transformers
Demos, resources, wrap-up & future directions27ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021
28https://zitniklab.hms.harvard.edu/drugml
ML for Drug Development - https://zitniklab.hms.harvard.edu/drugml - Tutorial at IJCAI, Jan 6, 2021