1
Navigating through the domains of biology and chemistry
15 May, 2009
James F. Rathman
The Ohio State University
Chihae Yang
US FDA CFSAN OFAS
2
Landscape of ToxCAST data
• 309 unique chemicals
– Mostly agrochemicals
• 524 in vitro bioassays
– 9 in vitro assay providers
– 285 cell‐based, 239 cell‐free
• 76 in vivo bioassays
– ToxRefDB
– Target organs (chronic), reproductive, developmental, carcinogenicity
3
Premise of Structure Activity Relationship
[Figure: example chemical structure drawing]
“There just aren’t that many people interested in Chemistry” – Dave Weininger
Properties
Molecular Structure
Activity
Molecular descriptors
If only we had that magical set of descriptors...
4
Premise of in vitro predictions
In vitro
Molecular biomarkers
If only we had that magical set of biomarkers, signatures...
In vivo state
“Biology is incredibly complicated. Why?” – Richard Dawkins
5
Pairwise positive agreement for all in vivo vs. all in vitro assays (α = 0.01)
[Figure: map of significant assay pairs; x‐axis: in vitro assays (524 total); y‐axis: in vivo assays (76 total); legend: p-value < 0.01, p-value < 0.001, p-value < 0.0001]
39,824 pairs
6
In vivo and BioSeek assay pairs with significant positive agreement (α = 0.01)
[Figure: map of significant assay pairs; x‐axis: in vitro assays (27 had at least one significant result); legend: p-value < 0.01, p-value < 0.001, p-value < 0.0001]
7
Pairwise agreement between in vitro and in vivo assay results

                 in vivo
                 0      1
in vitro   0     n00    n01
           1     n10    n11

Hypotheses: $H_0: \theta = 1$ (independent) vs. $H_1: \theta > 1$ (positive agreement)

where $\theta$ is the odds ratio:

$\theta = \dfrac{P(0 \mid 0)\,P(1 \mid 1)}{P(1 \mid 0)\,P(0 \mid 1)}, \qquad \hat{\theta} = \dfrac{n_{00}\,n_{11}}{n_{10}\,n_{01}}$

Fisher’s exact test: at a given significance level α

Significant pairs:
• 55 at α = 0.001
• 336 at α = 0.01
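As a rough illustration of the test named on this slide, here is a minimal sketch of a one-sided Fisher's exact test written with only the Python standard library. The function names are hypothetical, and the one-sided alternative (odds ratio greater than 1) is an assumption based on the "positive agreement" hypothesis; the slides do not say which software or which sidedness was used. The example counts are one reading of the first table on the next slide.

```python
from math import comb


def odds_ratio(n00, n01, n10, n11):
    """Sample odds ratio (n00 * n11) / (n10 * n01); infinite if a denominator cell is zero."""
    if n10 * n01 == 0:
        return float("inf")
    return (n00 * n11) / (n10 * n01)


def fisher_exact_greater(n00, n01, n10, n11):
    """One-sided Fisher's exact p-value for positive agreement (odds ratio > 1).

    With the table margins held fixed, the count of joint actives (n11)
    follows a hypergeometric distribution under independence. The p-value
    is the probability of observing n11 or more joint actives.
    """
    row0 = n00 + n01   # chemicals inactive in vitro
    row1 = n10 + n11   # chemicals active in vitro
    col1 = n01 + n11   # chemicals active in vivo
    total = row0 + row1
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(row0, col1 - k) for k in range(n11, upper + 1))
    return tail / comb(total, col1)


# Assumed reading of the NVS_ENZ_rAChE table: n00, n01, n10, n11
p = fisher_exact_greater(202, 28, 4, 14)
print(f"odds ratio = {odds_ratio(202, 28, 4, 14):.2f}, one-sided p = {p:.2e}")
```

A pair would then be counted as significant when the returned p-value falls below the chosen α. A two-sided test or a different treatment of ties would give somewhat different p-values.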
8
The “Good” News
                          CHR_Rat_Cholinesterase Inhibition
                          0      1
NVS_ENZ_rAChE       0     202    28
                    1     4      14
p‐value = 3.8 × 10^-9
concordance = 87%

                          CHR_Rat_Cholinesterase Inhibition
                          0      1
BSK_hDFCGF_VCAM1    0     136    13
                    1     70     29
p‐value = 2.8 × 10^-5
concordance = 67%

Chem. Res. Toxicol. 2009, 22, 633–638.
[Figure: structure drawings of chemicals 40349 and 40312]
PNAS, March 18, 2008, Vol 105 (11), 4295.
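The concordance figures on this slide can be checked with a few lines of arithmetic. This sketch assumes the two flattened tables read as (n00, n01, n10, n11) = (202, 28, 4, 14) and (136, 13, 70, 29), and that concordance means the fraction of chemicals on which the two assays agree; both are assumptions, although the resulting values match the percentages reported above.

```python
def concordance(n00, n01, n10, n11):
    """Fraction of chemicals on which the two assays agree (both 0 or both 1)."""
    return (n00 + n11) / (n00 + n01 + n10 + n11)


# Assumed cell order: n00, n01, n10, n11 (in vitro rows, in vivo columns)
tables = {
    "NVS_ENZ_rAChE": (202, 28, 4, 14),
    "BSK_hDFCGF_VCAM1": (136, 13, 70, 29),
}

for assay, cells in tables.items():
    print(f"{assay}: n = {sum(cells)}, concordance = {concordance(*cells):.0%}")
```

Concordance is dominated by the large inactive/inactive cell here, which is one reason the slides pair it with a significance test instead of reporting it alone.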
9
And the “Not so good news”
• The fraction of in vivo/in vitro assay pairs with statistically significant positive agreement is
– 0.0084 at α = 0.01
– 0.0014 at α = 0.001
• Type I error?
• Approximately half (50.8%) of the assay pairs have a p‐value ≈ 1 due to very few (often zero) actives in one or both assays.
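The "Type I error?" question can be made concrete by comparing the observed number of significant pairs with the number expected by chance alone. The sketch below uses the counts reported earlier in the deck (336 and 55 significant pairs out of 524 × 76). It assumes every pair is a true null with uniformly distributed p-values, which is only a rough guide: Fisher's exact test is conservative on discrete data, and about half the pairs have p ≈ 1, so the true chance count is likely lower.

```python
n_pairs = 524 * 76  # all in vitro x in vivo assay pairs

# Significant pair counts reported in the deck, keyed by significance level
observed = {0.01: 336, 0.001: 55}

summary = {}
for alpha, count in observed.items():
    fraction = count / n_pairs             # observed fraction of significant pairs
    expected_by_chance = alpha * n_pairs   # naive expectation if every pair were null
    summary[alpha] = (fraction, expected_by_chance)
    print(f"alpha = {alpha}: observed {count} ({fraction:.4f}), "
          f"about {expected_by_chance:.0f} expected by chance")
```

Under this naive assumption the observed count at α = 0.01 is below the chance expectation, while at α = 0.001 it is somewhat above it.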
10
Implications ‐ experimental factors
• Bioassays – detection limits and reproducibility...
• 309 Chemicals
– some of them look quite reactive...
• ...
• Remove reactive chemicals from analysis
– Acyl hydrazide, α‐halo carbonyl, reactive alkyl halides, halo amine, α‐halo ethers...
• Result: 288 chemicals
– Large impact on the statistics of the agreement pairs
• 30 pairs at α = 0.001
• 273 pairs at α = 0.01
11
Features dimension as a link
• ST: Structure‐In vivo assays
• SA: Structure‐In vitro assays
• In vitro (FA) – in vivo (FT)
Presented in EPA Comp Tox Forum in May 2007
Conceptual world of chemical representation for toxicity predictions
• PhysChem properties
• Calculated descriptors
• Structural keys, features
• EPA chemical classes
• FDA Redbook categories
• Metabolic & chemical reactivity
• Structural alerts
• OECD categories
• DSL groups
Pure structure classifiers
Mode of action classes
Categories/Alerts
supervised
unsupervised
linking layer
13
ToxCAST dataset ‐ “Structural Classifiers”
• Chains
– aliphatic, long‐alkyl, alkenyl, alkynyl, alkyl_c9:c10_alkenyl...
• Rings
– aromatic, carbocyclic, heterocyclic, fused (shapes), strained...
• Functional groups
– alcohol, amine, carboxylic acid, halide_alkyl, halide_aromatic...
• Coordination chemistry
– chelating ligands, metal environments
• To expand the structural categories defined in US FDA Redbook
• To participate in the chemical ontology movement
• To describe chemicals in FDA and ToxRef databases
14
All in vitro assays against selected “Structural Classifiers”
[Figure: heat map of structure classifiers vs. 524 in vitro bioassays, grouped as cell‐free and cell‐based; legend bins: mean 0‐0.25, 0.25‐0.5, 0.5‐0.75, 0.75‐1.0]
15
All 76 in vivo assays against the selected “Structural Classifiers”
[Figure: heat map of structure classifiers vs. 76 in vivo bioassays (CHR_, DEV_, and MGR_ endpoints in mouse, rat, and rabbit); legend bins: mean 0‐0.25, 0.25‐0.5, 0.5‐0.75, 0.75‐1.0]
16
Tumorigenic and proliferative lesions – liver and thyroid
• Because of the TSH mechanism, thyroid tumors are usually correlated with liver lesions.
• Multivariate plots using structural classes again show the relationship.
Used all 243 structural classifiers
Feature level:
Compound level: Sensitivity = 39%, Specificity = 90%
Current Computer‐Aided Drug Design, 2006, 2, 135‐150
[Figure panels: liver, thyroid]
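For readers less familiar with the two performance measures quoted on this slide, the sketch below shows how sensitivity and specificity are computed from a confusion matrix. The counts are hypothetical, chosen only so that the result matches the reported 39% and 90%; the actual counts behind the slide are not given.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly active compounds that the model flags as active."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of truly inactive compounds that the model flags as inactive."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not taken from the slides)
true_pos, false_neg = 39, 61
true_neg, false_pos = 90, 10

print(f"sensitivity = {sensitivity(true_pos, false_neg):.0%}")
print(f"specificity = {specificity(true_neg, false_pos):.0%}")
```

A model with high specificity and low sensitivity, as here, rarely raises false alarms but misses many of the true positives.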
17
PC projections of chemicals using the “Structure Classifiers”
[Figure: two scatter‐plot panels (axes p2, p3, p4): CHR_Rat_ThyroidTumors and CHR_Rat_LiverTumors; labeled chemicals 40424, 40540, 40564, and 40574, with structure drawings]
230 chemicals with both liver and thyroid tumors are plotted.
18
A rodent bioassay vs. an in vitro genetox assay
[Figure: two scatter‐plot panels (axes p2, p3, p4): CHR_Rat_Tumorigen and GreenScreen (GADD45a); labeled chemicals 40440, 40444, 40540, and 40612, with structure drawings]
Andrew Knight, Steve Little et al., manuscript submitted, 2009
19
A rodent bioassay vs. an in vitro assay
[Figure: two scatter‐plot panels (axes p2, p3, p4): CHR_Rat_Tumorigen and CLZD_2B6_6; labeled chemicals 40440, 40444, 40488, 40501, and 40612, with structure drawings]
20
Summary
• Finding signatures systematically from a variety of in vitro assays relating to in vivo phenotypic effects may be possible.
– Data mine the 30 significant agreement pairs of in vitro and in vivo assays
– Data mine the correlation of [AT]
$[FA]^{T}\,[FT] = [AT]$
[FA]: features vs. in vitro activity
[FT]: features vs. in vivo activity
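The matrix relation above can be sketched with toy data. Here FA is a features × in vitro assays matrix and FT is a features × in vivo endpoints matrix, so the product of FA transposed with FT is an in vitro assays × in vivo endpoints matrix. The binary entries, the matrix sizes, and the helper names are all hypothetical; in the slides the feature matrices appear to hold mean activities between 0 and 1 instead.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    b_columns = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_columns] for row in a]


# Toy data: 3 structural features, 2 in vitro assays, 2 in vivo endpoints
FA = [[1, 0],   # feature 1 is associated with in vitro assay 1 only
      [1, 1],   # feature 2 is associated with both in vitro assays
      [0, 1]]   # feature 3 is associated with in vitro assay 2 only
FT = [[1, 0],   # feature 1 is associated with in vivo endpoint 1
      [1, 0],   # feature 2 is associated with in vivo endpoint 1
      [0, 1]]   # feature 3 is associated with in vivo endpoint 2

AT = matmul(transpose(FA), FT)  # rows: in vitro assays, columns: in vivo endpoints
print(AT)
```

With binary entries, each element of AT counts the structural features shared by one in vitro assay and one in vivo endpoint, which is one way to read the features dimension as the link between the two.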
From the vantage point of a pragmatist
• Are these predictions better than QSAR models and experts’ rules?
– e.g., a weight of evidence model for rat tumorigens (230)
• 4 partial logistic regression models based on “structure classifiers” and whole molecule properties are optimized to give a final result.
• sensitivity: 75%; specificity: 90%; ROC (true positive/false positive): 3.1
• Will the bioassays help build mode‐of‐action models?
– e.g., aromatic halides for liver necrosis or pyridines for kidney nephropathy
• How practical is it to use bioassays as predictors?
22
Integrated Testing Strategies
• How do we apply what we learned from ToxCAST analysis?
– Our perpetual wish list: better assays and better descriptors
• How do we integrate actual testing to improve predictions during the prioritization cycle?
– select assays and descriptors
– link experiment to predictions/prioritizations