Optimal Transport Graph Neural Networks
Graph Convolutions and Chemprop Model
[Figure: a molecule is encoded by a Graph Convolutional Network (GCN) / Chemprop into a molecule embedding, which is used to predict a property (Toxic?).]
Property smoothness & robustness of representations
[Figure: molecules labeled "Small, soluble", "Large, soluble", and "Large, insoluble".]
[Figure: plot labeled "Solubility", "Molecule embedding distance", and "Property value distance".]
From atom embeddings to molecule embedding
• Simplistic graph compression
The green, red, and orange sets have the same sum and the same average, so sum or mean aggregation maps all three to the same molecule embedding.
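The collision described above can be reproduced numerically. The following is a minimal sketch, not taken from the slides: the three point sets, their 2-D coordinates, and the helper names `sum_pool`, `mean_pool`, and `spread` are all invented for illustration.

```python
import numpy as np

# Hypothetical 2-D atom embeddings for three different molecules
# (values invented for illustration; they do not come from the slides).
green = np.array([[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]])
red = np.array([[0.0, 0.0], [2.0, 2.0], [0.0, 2.0], [2.0, 0.0]])
orange = np.array([[-1.0, 1.0], [3.0, 1.0], [1.0, -1.0], [1.0, 3.0]])


def sum_pool(atoms):
    """Sum aggregation: add up all atom embeddings."""
    return atoms.sum(axis=0)


def mean_pool(atoms):
    """Mean aggregation: average all atom embeddings."""
    return atoms.mean(axis=0)


def spread(atoms):
    """Mean squared distance of the atoms to their centroid."""
    centered = atoms - atoms.mean(axis=0)
    return float((centered ** 2).sum(axis=1).mean())


for name, atoms in [("green", green), ("red", red), ("orange", orange)]:
    print(name, sum_pool(atoms), mean_pool(atoms), spread(atoms))
```

All three sets pool to the same sum (4, 4) and the same mean (1, 1), yet their spreads are 0, 2, and 4, so the sets are clearly different even though sum and mean aggregation cannot distinguish them.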
Better set functions?
Prototypes Inspired From Functional Groups
• Prototype 1: Aromatic Nitro
• Prototype 2: Azo-type
• Prototype 3: Nitroso
• Prototype 4: Aromatic Amine
[Figure: example molecule-to-prototype distances shown: dist=0.5, dist=2, dist=10, dist=20.]
Our idea:
• learn a dictionary of structural basis functions that serve to highlight key facets of compounds
• express molecules by relating them to abstract molecular prototypes
• prototypes highlight property values associated with different structural features (e.g. solubility)
Prototypes as GCN Embedding Point Clouds
Wasserstein Prototypes
[Figure: the molecule's atom embeddings are compared with Prototypes 1 to 4, giving distances d1, d2, d3, d4; the vector (d1, d2, d3, d4) is used to predict the property (Toxic?).]
Wasserstein distances between sets of embeddings
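To make the Wasserstein distance between sets of embeddings concrete, here is a minimal sketch that solves the optimal transport linear program directly with SciPy. Several choices are assumptions not stated in the slides: uniform weights on the points, a Euclidean ground cost (so this is the 1-Wasserstein distance), randomly generated embeddings, and the helper names `wasserstein_between_sets` and `prototype_features`.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.spatial.distance import cdist


def wasserstein_between_sets(X, Y):
    """1-Wasserstein distance between point clouds X (n, d) and Y (m, d).

    Assumes uniform weights and a Euclidean ground cost.
    """
    n, m = len(X), len(Y)
    cost = cdist(X, Y)                      # (n, m) pairwise distances
    a = np.full(n, 1.0 / n)                 # mass on each point of X
    b = np.full(m, 1.0 / m)                 # mass on each point of Y
    # Transport plan T is flattened row-major: T[i, j] -> index i * m + j.
    row_sums = np.kron(np.eye(n), np.ones((1, m)))   # each row of T sums to a[i]
    col_sums = np.kron(np.ones((1, n)), np.eye(m))   # each column of T sums to b[j]
    result = linprog(
        cost.ravel(),
        A_eq=np.vstack([row_sums, col_sums]),
        b_eq=np.concatenate([a, b]),
        bounds=(0, None),
        method="highs",
    )
    return result.fun


def prototype_features(atom_embeddings, prototypes):
    """Describe a molecule by its distances (d1, ..., dK) to K prototype point clouds."""
    return np.array(
        [wasserstein_between_sets(atom_embeddings, p) for p in prototypes]
    )


# Hypothetical data: one molecule with 5 atoms and 4 prototypes of 4 points each.
rng = np.random.default_rng(0)
atoms = rng.normal(size=(5, 3))
prototypes = [rng.normal(size=(4, 3)) for _ in range(4)]
features = prototype_features(atoms, prototypes)
print(features)
```

The resulting vector has one entry per prototype and would be fed to a downstream predictor. In a trained model the prototype point clouds would be learned parameters rather than random draws, and a dedicated optimal transport solver would likely replace the generic linear program used here.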
Property smoothness: Latent Space for Real Models
[Figure panels: "Molecule embedding in 2D vs property value" and "Molecule embedding distance vs property value distance".]
• Graph Convolution w/ sum aggregation: Spearman ρ = .424 ± .029, Pearson r = .393 ± .049
• Wasserstein Prototypes: Spearman ρ = .815 ± .026, Pearson r = .828 ± .020
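One plausible way to obtain Spearman and Pearson scores like those reported above is to correlate pairwise distances between molecule embeddings with pairwise differences in property value. The slides do not state the exact protocol, so the sketch below is an assumption; the function name `smoothness_scores` and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import pearsonr, spearmanr


def smoothness_scores(embeddings, properties):
    """Correlate embedding distances with property value distances.

    embeddings: (n, d) array of molecule embeddings.
    properties: (n,) array of scalar property values.
    Returns (Spearman rho, Pearson r) over all molecule pairs.
    """
    emb_dist = pdist(embeddings)                               # ||z_i - z_j||
    prop_dist = pdist(np.asarray(properties).reshape(-1, 1))   # |y_i - y_j|
    rho, _ = spearmanr(emb_dist, prop_dist)
    r, _ = pearsonr(emb_dist, prop_dist)
    return rho, r


# Synthetic check: if the embedding is just the property value padded with a
# zero coordinate, embedding distance equals property distance for every pair.
rng = np.random.default_rng(0)
y = rng.normal(size=30)
z = np.column_stack([y, np.zeros_like(y)])
rho, r = smoothness_scores(z, y)
print(rho, r)
```

A perfectly smooth latent space, as in this synthetic check, gives both correlations equal to 1; values closer to 0 mean that molecules close in embedding space can still differ widely in property value.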
Results on Molecular Property Prediction
• Solubility
• Lipophilicity
• Inhibitors of human β-secretase 1 (BACE-1)
• Blood-brain barrier penetration (permeability)