Outline: Introduction; Describing inorganic complexes; Similarity and model uncertainty
ML for inorganic molecular design: descriptors and similarity in transition metal chemical space
Jon Paul Janet (1), Heather Kulik (1)
(1) Department of Chemical Engineering, Massachusetts Institute of Technology
255th ACS National Meeting, New Orleans
03.19.18
Data-driven molecular design
Machine learning is transforming how we design new materials...
OLED chemical space: NN $\sim 10^6$, DFT $\sim 10^5$, Exp. $\sim 10^1$
Gomez-Bombarelli, R. et al. Nat. Mater., 15(10):1120-1127, 2016.
Ma, X. et al. J. Phys. Chem. Lett., (18):3528-3533, 2015.
...what about inorganic molecular complexes?
[Figure: generic octahedral complex, metal M with six ligands L] Bignozzi, C. et al. Coord. Chem. Rev., 257(9), 2013.
[Figure: Pt complex with a nitrogen-donor ligand and two Cl ligands] Periana, R. A. et al. Science, 280(5363), 1998.
Transition metal complexes
[Figure: energy-level diagram of the $t_{2g}$ and $e_g$ orbitals for low-spin and high-spin states. Panels are labelled $\Delta E_{H-L} < 0$, $\Delta E_{H-L} > 0$, and $\Delta E_{H-L} \sim 0$ (with a perturbation, $\Delta T$).]
[Figure: redox between $M^{2+}$ and $M^{3+}$ with one electron e, labelled $\Delta E_{III-II}$.]
How to estimate properties?
features → property, by one of:
• experiment: weeks, months
• density functional theory (DFT), $H\Psi = E\Psi$: days
• model: seconds
Input space design
What would be the ideal feature space?
[Figure: complexes $c_i$, $c_j$ in chemical space $C_f$ are mapped to points $x_i$, $x_j$ in descriptor space $X \subset \mathbb{R}^d$, separated by the distance $d(x_i, x_j)$.]
Good descriptors:
• cheap
• small as possible
• preserve similarity
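The "preserve similarity" requirement can be made concrete with a distance in descriptor space. The sketch below is only an illustration: it assumes a Euclidean metric for $d(x_i, x_j)$ (the slide does not name one), and the function names and the three-feature vectors are hypothetical.

```python
import numpy as np

def descriptor_distance(x_i, x_j):
    """Euclidean distance d(x_i, x_j) between two descriptor vectors in R^d.

    The Euclidean metric is an assumption made for this sketch.
    """
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    return float(np.linalg.norm(x_i - x_j))

def nearest_neighbours(X, query, k=1):
    """Indices of the k rows of X that lie closest to the query vector."""
    X = np.asarray(X, dtype=float)
    dists = np.linalg.norm(X - np.asarray(query, dtype=float), axis=1)
    return np.argsort(dists)[:k]

# Hypothetical 3-feature descriptors for three complexes (illustrative numbers only).
X = np.array([[0.0, 0.0, 0.0],
              [3.0, 4.0, 0.0],
              [0.1, 0.0, 0.0]])

d01 = descriptor_distance(X[0], X[1])       # distance between complexes 0 and 1
nn = nearest_neighbours(X[1:], X[0], k=1)   # closest of the other two to complex 0
```

If the descriptor preserves similarity, complexes that sit close together under this distance should also have similar properties, which is what a kernel or nearest-neighbour model relies on.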
Data for spin splitting
Data for octahedral complexes [1]:
[Figure: octahedral complex, metal M with two axial ligands $L_{ax}$ and four equatorial ligands $L_{eq}$]
1345 (194) complexes
7 HF values
[1] Janet, J.P., and Kulik, H.J. Chem. Sci., 2017, 8, 5137-5152.
Computational details: B3LYP-like DFT; HF exchange in 0-30%; gas-phase optimization; LANL2DZ/6-31G*; high- and low-spin; M(II)/(III)
Coulomb matrix eigenspectrum (CM-ES) descriptor and kernel ridge regression (KRR)
$\Delta E_{H-L}$ RMSE with CM-ES: 19.2 kcal/mol
Why?
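For readers unfamiliar with the CM-ES and KRR combination, here is a minimal sketch of how the two pieces fit together. It uses the commonly cited Coulomb matrix definition (diagonal $0.5 Z_i^{2.4}$, off-diagonal $Z_i Z_j / |R_i - R_j|$). The Gaussian kernel, the sorting of eigenvalues by magnitude, the zero padding, and all function names are assumptions for illustration and are not taken from the slides.

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Coulomb matrix: 0.5 * Z_i**2.4 on the diagonal, Z_i * Z_j / |R_i - R_j| elsewhere."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    n = len(Z)
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                C[i, j] = 0.5 * Z[i] ** 2.4
            else:
                C[i, j] = Z[i] * Z[j] / np.linalg.norm(R[i] - R[j])
    return C

def cm_eigenspectrum(Z, R, size):
    """Eigenvalues of the Coulomb matrix, sorted by descending magnitude (assumed
    convention) and zero-padded to a fixed length; requires len(Z) <= size."""
    eig = np.linalg.eigvalsh(coulomb_matrix(Z, R))
    eig = eig[np.argsort(-np.abs(eig))]
    out = np.zeros(size)
    out[:len(eig)] = eig
    return out

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def krr_fit(X, y, sigma, lam):
    """Closed-form kernel ridge regression weights: (K + lam * I)^-1 y."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, sigma):
    """Predict targets for the rows of X_new from the fitted weights alpha."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Toy check on a two-atom system with unit charges at unit separation.
spectrum = cm_eigenspectrum([1, 1], [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], size=3)

# Toy regression on made-up one-dimensional inputs (not data from the talk).
X_toy = np.array([[0.0], [1.0], [2.0]])
y_toy = np.array([1.0, -2.0, 0.5])
alpha = krr_fit(X_toy, y_toy, sigma=1.0, lam=1e-10)
y_fit = krr_predict(X_toy, alpha, X_toy, sigma=1.0)
```

Because the eigenspectrum depends on every atom and every interatomic distance, it is dominated by overall molecular size and shape, which is one way to read the "Why?" on the slide.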
A tale of two complexes
[Figure: two PC 1 vs. PC 2 projections of the data set; legend entries: $\Delta E_{H-L}$, size. Highlighted complexes: $\mathrm{Fe[pisc]}_6^{3+}$ and $\mathrm{Fe[misc]}_6^{3+}$.]
$\mathrm{Fe[pisc]}_6^{3+}$: $\Delta E_{H-L}$ = 40.7 kcal/mol
$\mathrm{Fe[misc]}_6^{3+}$: $\Delta E_{H-L}$ = 37.7 kcal/mol
MCDL-25
Mixed continuous discrete local (MCDL) descriptors:
• metal properties: identity, oxidation state (example: Fe(II))
• local ligand properties: max $\Delta\chi$ (example values shown: $\chi = 3.44$, $\chi = 2.55$)
• global ligand properties: Kier index
[Figure: bar chart of test RMSE (kcal/mol, axis from 0 to 20) by method, CM-ES vs. MCDL]
Janet, J.P., and Kulik, H.J. Chem. Sci., 2017, 8, 5137-5152.
Extensible, continuous descriptors - RACs
Based on autocorrelations [2]
[Figure: example fragment with metal M, four O atoms and two C atoms]
Worked example at depth 1, using nuclear charge Z:
$d_1: \sum_{O,C} Z_O Z_C = 48$
$d_1: 48 + \sum_{C,O} Z_O Z_C = 144 + 48$
$d_1: \sum_i \sum_j Z_i Z_j \, \delta(d_{ij}, 1)$
$d_x: \sum_i \sum_j Z_i Z_j \, \delta(d_{ij}, x)$
[Figure: MUE (kcal/mol, axis from 8 to 18) vs. maximum AC depth (0 to 6), train and test curves]
How to adapt to TM complexes? Restrict the scope to focus on near-metal atoms:
$d_1: \sum_{M,O} Z_M Z_O$
$d_2: \sum_{M,C} Z_M Z_C$
$d_3: \sum_{M,O} Z_M Z_O$
$(Z_i - Z_j)$
properties: T, $\chi$, Z, I, S
$\sim$ 160 features in total
[2] Broto, P., Moreau, G. and Vandycke, C. Eur. J. Med. Chem., 19(1):71-78, 1984.
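The autocorrelation sums above can be sketched on a molecular graph as follows. This is a minimal illustration, not the authors' implementation: $d_{ij}$ is taken to be the number of bonds on the shortest path between atoms i and j, the double sum runs over ordered pairs exactly as the formula is written (other conventions count each pair once), and the function names and the small Fe/O/C fragment are hypothetical.

```python
from collections import deque

def graph_distances(adjacency, start):
    """Bond-count (shortest-path) distances from one atom, by breadth-first search."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        a = queue.popleft()
        for b in adjacency[a]:
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist

def autocorrelation(prop, adjacency, depth):
    """sum_i sum_j P_i * P_j * delta(d_ij, depth) over all ordered atom pairs."""
    total = 0.0
    for i in adjacency:
        for j, d_ij in graph_distances(adjacency, i).items():
            if d_ij == depth:
                total += prop[i] * prop[j]
    return total

def metal_centred_autocorrelation(prop, adjacency, metal, depth):
    """Same product, restricted to pairs that start at the metal atom."""
    dist = graph_distances(adjacency, metal)
    return sum(prop[metal] * prop[j] for j, d in dist.items() if d == depth)

# Hypothetical fragment: Fe (atom 0) bonded to two O (1, 2), each O bonded to one C (3, 4).
adjacency = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2]}
Z = {0: 26, 1: 8, 2: 8, 3: 6, 4: 6}  # nuclear charges used as the atomic property

full_d1 = autocorrelation(Z, adjacency, depth=1)
metal_d1 = metal_centred_autocorrelation(Z, adjacency, metal=0, depth=1)
metal_d2 = metal_centred_autocorrelation(Z, adjacency, metal=0, depth=2)
```

Swapping Z for another atomic property, or the product for a difference such as $(Z_i - Z_j)$, gives further features of the same form, which is how the feature count grows with the property list and the maximum depth.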
Feature selection
[Figure: RMSE (kcal/mol, axis from 1.5 to 4.0) vs. descriptor dimension (50 to 150) for the feature sets MCDL, RAC155, UV86, RFE43, LS28 and rF41]
Janet, J.P., and Kulik, H.J. J. Phys. Chem. A, 2017, 121, 46, 8939-8954.
11/21
A tale of two complexes, II
[Figure: PCA projections (PC 1 versus PC 2) of the complexes, with panels labeled ∆E_H−L and size.]
12/21
Do features depend on properties?
[Figure: molecular graph with a central metal, N atoms, and surrounding C and H atoms.]
[Figure: panels titled spin splitting (randF), spin splitting (randF), bond lengths (randF) and redox (randF), annotated from more 'electronic' to more 'topological'.]
13/21
Mapping TM complex space
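[Figure: PCA projection (PC 1 versus PC 2) in the feature space selected by random forest for redox, colored by E0 (eV, color scale from 3 to 11), with one point marked '?'.]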
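The PC 1 / PC 2 map can be reproduced in outline with a singular value decomposition of the centered feature matrix. The sketch below is an assumption about the procedure, not the authors' code: the helper name `pca_project` and the random stand-in features are hypothetical, and whether features were standardized first is not stated on the slide.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the leading principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are the principal directions, ordered by singular value.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)  # fraction of variance per component
    return scores, explained[:n_components]

# Hypothetical stand-in for the selected feature vectors of 50 complexes.
rng = np.random.default_rng(1)
features = rng.normal(size=(50, 5))
scores, explained = pca_project(features)
```

Each row of `scores` gives the (PC 1, PC 2) coordinates of one complex; coloring the points by a property such as E0 gives a map of the kind shown.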
14/21
Mapping TM complex space
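[Figure: schematic of the form (complex + complex) / 2 = ?, with the labeled complexes listed below.]
Cr(II) [H2O]5 [misc]: ∆G = 5.3 eV
Co(II) [CO]5 [pyr]: ∆G = 8.1 eV
Fe(II) [CO]4 [pyr][water]: ∆G = 7.8 eV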
15/21
Model transferability
Test-set performance is not necessarily a good metric for general transferability [2]:
[Figure: four panels. (1) Fe(III): ∆E_H−L (kcal/mol) for the ligand combinations pisc−pisc, pisc−NCS, pisc−H2O, pisc−Cl, H2O−H2O, Cl−Cl and NCS−NCS, ANN versus B3LYP. (2) Fe(III)[pisc]6: ∆E_H−L (kcal/mol) versus HFX (0.0 to 0.3), ANN versus DFT. (3) Absolute error (kcal/mol) for the train and test sets, annotated 3.13 and 2.97. (4) Absolute error (kcal/mol) for the train, test and CSD sets.]
[2] Janet, J.P., and Kulik, H.J. Chem. Sci., 2017, 8, 5137-5152.
16/21
Model transferability
Uncertainty estimates are essential for our surrogate model to explore chemical space:
[Figure: parity plot of DFT splitting (kcal/mol) versus surrogate splitting (kcal/mol).]
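Uncertainty from mc-dropout [1]: the ANN model approximates variational inference with a GP under some conditions:
\[
\operatorname{var}(y^{*} \mid x^{*}) \approx \tau^{-1} + \frac{1}{J}\sum_{j=1}^{J} y_j^{T} y_j - \bar{y}^{T}\bar{y}, \qquad \bar{y} = \frac{1}{J}\sum_{j=1}^{J} y_j
\]
where y_j is the prediction from the j-th of J stochastic forward passes and τ is the model precision.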
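A minimal sketch of the mc-dropout variance estimate for a scalar output is given below. Following Gal and Ghahramani, the squared mean of the sampled predictions is subtracted from their second moment, and the inverse precision 1/τ is added. The one-layer `dropout_forward` model, its weights, the dropout rate and the value of `tau` are hypothetical stand-ins for a trained ANN.

```python
import random

def mc_dropout_variance(samples, tau):
    """Predictive variance from J stochastic forward passes (scalar output)."""
    J = len(samples)
    mean = sum(samples) / J
    second_moment = sum(y * y for y in samples) / J
    # 1/tau is the noise term; the rest is the spread of the sampled predictions.
    return 1.0 / tau + second_moment - mean * mean

def dropout_forward(x, weights, p_drop, rng):
    """Toy one-layer model: each weight is kept with probability 1 - p_drop
    and rescaled by 1 / (1 - p_drop) (inverted dropout)."""
    keep = 1.0 - p_drop
    return sum(w * xi / keep for w, xi in zip(weights, x) if rng.random() < keep)

rng = random.Random(0)
x = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 2.0]  # hypothetical weights, not from the source
samples = [dropout_forward(x, weights, 0.1, rng) for _ in range(200)]
variance = mc_dropout_variance(samples, tau=10.0)
```

Dropout stays switched on at prediction time; the spread across passes is what carries the uncertainty information.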
[Figure: parity plot of DFT splitting (kcal/mol) versus surrogate splitting (kcal/mol), and absolute error (kcal/mol) versus distance (0.5 to 2.0).]
[1] Gal, Y. and Ghahramani, Z. ICML, 2016, 1050-1059.
17/21
Demonstration
Can we use the ANN model to find new spin-crossover materials, i.e. ∆E_H−L = 0?
Define a design space of 32 ligands and 5 metals, with ∼5600 possible complexes under forced axial/equatorial symmetry [3]:
[3] Janet, J.P., Chan, L. and Kulik, H.J. J. Phys. Chem. Lett., 2018, 9, 5, 1064-1071.
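The search for ∆E_H−L near zero can be pictured as a filter over ANN predictions for every member of the design space. The sketch below is only illustrative: the function name, the complex labels, the predicted values and the 5 kcal/mol tolerance are all hypothetical and are not taken from the source.

```python
def screen_spin_crossover(predictions, tolerance):
    """Return the names of complexes whose predicted splitting lies within
    +/- tolerance of zero, ordered from closest to zero outward."""
    hits = [(abs(dE), name) for name, dE in predictions.items() if abs(dE) <= tolerance]
    return [name for _, name in sorted(hits)]

# Hypothetical predicted splittings in kcal/mol.
predicted = {
    "complex_A": -12.4,
    "complex_B": 1.8,
    "complex_C": 4.9,
    "complex_D": 27.0,
}
leads = screen_spin_crossover(predicted, tolerance=5.0)
```

Leads selected this way would then be checked with DFT, since the surrogate prediction alone says nothing about its own reliability.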
18/21
Demonstration
The ANN is trained on 14 of these ligands, covering only 2% of the design space.
We can visualize the design space using t-SNE [4]:
[Figure: t-SNE map of the design space (axes from −40 to 40) with a color scale from 0.0 to 2.0.]
[4] van der Maaten, L. and Hinton, G. J. Mach. Learn. Res., 2008, 9, 2579-2605.
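A t-SNE embedding of this kind can be sketched with scikit-learn, assuming that library is available. The random feature matrix, the perplexity and the seed are hypothetical choices for illustration; the slide does not give the settings that were used.

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical stand-in for the feature vectors of 60 candidate complexes.
rng = np.random.default_rng(0)
features = rng.normal(size=(60, 8))

# Perplexity must be smaller than the number of samples.
embedding = TSNE(n_components=2, perplexity=10.0, random_state=0).fit_transform(features)
```

Each row of `embedding` is a 2D coordinate for one complex. Distances in a t-SNE map are only locally meaningful, so such a plot is best read as a picture of neighborhoods, not of global geometry.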
19/21
How accurate are we?
Test 51 leads from the ANN with DFT [5]:
[Figure: two panels. (1) Histogram of errors (kcal/mol, from −20 to 10) against count for the 51 leads, with an annotation 'sub. isocyanides'. (2) |∆E_H−L(ANN) − ∆E_H−L(GO)| (kcal/mol) versus distance to train (0.00 to 0.75), with legend entries 2, 3 and Cr, Mn, Fe, Co.]
[5] Janet, J.P., Chan, L. and Kulik, H.J. J. Phys. Chem. Lett., 2018, 9, 5, 1064-1071.
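The "distance to train" axis suggests a simple reliability check: measure how far a candidate sits from the training data in feature space. The sketch below assumes Euclidean distance to the single nearest training point, which is one plausible definition and not necessarily the one used in the talk; the feature vectors are hypothetical.

```python
import math

def distance_to_train(candidate, training_set):
    """Euclidean distance from a candidate feature vector to its nearest training point."""
    return min(math.dist(candidate, point) for point in training_set)

# Hypothetical two-dimensional feature vectors.
train = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
near = distance_to_train([0.1, 0.0], train)
far = distance_to_train([3.0, 4.0], train)
```

A candidate with a large distance is one where the ANN is extrapolating, so its prediction deserves less trust or a direct DFT check.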
20/21
Conclusions
choice of molecular representation is important
different properties depend unequally on features
feature-space geometry can provide insight into model reliability
building 'chemical intuition' into descriptor construction can drastically improve learning
conversely, feature selection can contribute to understanding systems
21/21
Acknowledgments
Thanks to the Kulik group and funding partners.