Slides from a tour of a number of US labs in December 2014
ChEMBL: Resources for Drug Discovery
John P. Overington @johnpoverington [email protected]
EMBL-EBI
ChEMBL Strategy
• Comprehensively catalogue historical drug discovery
• Include successes and failures
• Large scale abstraction curation of primary literature
• Direct depositions
• Drugs can be small molecules, peptides, recombinant proteins, siRNA, cells, viruses, etc.
• ‘Learn’ rules for drug discovery ‘success’
• Target selection and prioritisation - druggability
• Lead discovery, optimisation, clinical candidate selection
• Develop approaches to new target classes – e.g. PPIs
• Make all data freely available to entire community
• Encourage re-use, integration and cross-linking
Drug Discovery
Stages: Discovery, Development, Use

Target Discovery
• Target identification
• Microarray profiling
• Target validation
• Assay development
• Biochemistry
• Clinical/Animal disease models

Lead Discovery
• High-throughput Screening (HTS)
• Fragment-based screening
• Focused libraries
• Screening collection

Lead Optimisation
• Medicinal Chemistry
• Structure-based drug design
• Selectivity screens
• ADMET screens
• Cellular/Animal disease models
• Pharmacokinetics

Preclinical Development
• Toxicology
• In vivo safety pharmacology
• Formulation
• Dose prediction

Phase 1: PK tolerability
Phase 2: Efficacy
Phase 3: Safety & Efficacy
Launch (Phase 4): Indication discovery, repurposing & expansion

ChEMBL19 content
Med. Chem. SAR: >1,638,000 compound records, >12,800,000 bioactivities, ~57,150 abstracted papers, ~10,579 targets
Clinical Candidates: ~12,000 clinical candidates
Drugs: ~1,600 drugs
Drug Optimisation
[Figure: chemical structures of the compounds listed below, labelled Prototype, 1st generation, 2nd generation, 3rd generation and 4th generation, and by class (imidazole, triazole). The structures are not reproduced here.]
Azomycin (1956): Streptomyces natural product, trichomonacidal, ‘toxic’
Metronidazole 1962
Clotrimazole 1970
Miconazole 1970
Econazole 1972
Tinidazole 1970
Bifonazole 1981
Sulconazole 1980
Ketoconazole 1978
Itraconazole 1984
Terconazole 1980
Voriconazole 2002
Fluconazole 1988
Fosfluconazole 2004
Posaconazole 2005
After W. Sneader
Overview of EMBL-EBI Chemistry Resources
• ChEMBL: bioactivity data from literature and depositions
• ChEBI: structures and metadata for metabolites; chemical ontology
• Atlas: ligand-induced transcript response
• PDBe: ligand structures from structurally defined protein complexes
• SureChEMBL: ligand structures from patent literature
• 3rd party data: ZINC, PubChem, ThomsonPharma DOTF, IUPHAR, DrugBank, KEGG, NIH NCC, eMolecules, FDA SRS, PharmGKB, Selleck, ….
• UniChem: InChI-based resolver (full + relaxed ‘lenses’)
RDF and REST API interfaces
Counts shown on the slide (not matched to individual resources in this extraction): 15K, 750, >15M, 1.5M, 40K, ~75M
ChEMBL
What Is the ChEMBL Data?
SAR Data
Compound
Assay
Ki=4.5 nM
>Thrombin MAHVRGLQLPGCLALAALCSLVHSQHVFLAPQQARSLLQRVRRANTFLEEVRKGNLERECVEETCSYEEAFEALESSTATDVFWAKYTACETARTPRDKLAACLEGNCAEGLGTNYRGHVNITRSGIECQLWRSRYPHKPEINSTTHPGADLQENFCRNPDSSTTGPWCYTTDPTVRRQECSIPVCGQDQVTVAMTPRSEGSSVNLSPPLEQCVPDRGQQYQGRLAVTTHGLPCLAWASAQAKALSKHQDFNSAVQLVENFCRNPDGDEEGVWCYVAGKPGDFGYCDLNYCEEAVEEETGDGLDEDSDRAIEGRTATSEYQTFFNPRTFGSGEADCGLRPLFEKKSLEDKTERELLESYIDGRIVEGSDAEIGMSPWQVMLFRKSPQELLCGASLISDRWVLTAAHCLLYPPWDKNFTENDLLVRIGKHSRTRYERNIEKISMLEKIYIHPRYNWRENLDRDIALMKLKKPVAFSDYIHPVCLPDRETAASLLQAGYKGRVTGWGNLKETWTANVGKGQPSVLQVVNLPIVERPVCKDSTRIRITDNMFCAGYKPDEGKRGDACEGDSGGPFVMKSPFNNRWYQMGIVSWGEGCDRDGKYGFYTHVFRLKKWIQKVIDQFGE
ED2=230 nM
What Is the ChEMBL Data?
Inhibition of human Thrombin
PTT (partial thromboplastin time)
ChEMBL Target Types
• Protein, e.g. PDE5
• Protein complex, e.g. Nicotinic acetylcholine receptor
• Protein family, e.g. Muscarinic receptors
• Nucleic Acid, e.g. DNA
• Sub-cellular fraction, e.g. Mitochondria
• Tissue, e.g. Trachea
• Cell line, e.g. HEK293 cells
• Organism, e.g. Drosophila
ChEMBL
• The world’s largest primary public database of medicinal chemistry data
– ~1.6 million compounds, ~10,000 targets, ~12 million bioactivities
• Truly Open Data - CC-BY-SA license
• ChEMBL data also loaded into BindingDB, PubChem BioAssay and BARD
https://www.ebi.ac.uk/chembl
A. Gaulton et al (2012) Nucleic Acids Research Database Issue. 40, D1100-D1107
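Since the data are exposed through a REST API, a query for the bioactivities of one target can be sketched as below. This is a minimal sketch under stated assumptions, not the documented interface at the time of these slides: the base URL `https://www.ebi.ac.uk/chembl/api/data`, the `activity` resource, its filter names and the response field names are assumptions taken from the current public web services, and `CHEMBL204` is assumed to be the identifier for human thrombin (check it in the target search before relying on it). The sample payload is hand-written for illustration and is not real ChEMBL data.

```python
from urllib.parse import urlencode

# Assumed base URL of the ChEMBL web services (not stated on the slides).
CHEMBL_API = "https://www.ebi.ac.uk/chembl/api/data"


def activity_query_url(target_chembl_id, standard_type, limit=20):
    """Build a URL asking for activities of one type measured against one target."""
    params = {
        "target_chembl_id": target_chembl_id,  # assumed filter name
        "standard_type": standard_type,        # e.g. "Ki" or "IC50"
        "limit": limit,
    }
    return f"{CHEMBL_API}/activity.json?{urlencode(params)}"


def summarise(payload):
    """Reduce a decoded JSON response to (compound, type, value, units) tuples."""
    return [
        (
            a["molecule_chembl_id"],
            a["standard_type"],
            a["standard_value"],
            a["standard_units"],
        )
        for a in payload.get("activities", [])
    ]


# Hand-written payload in the assumed response shape (illustrative only).
sample_payload = {
    "activities": [
        {
            "molecule_chembl_id": "CHEMBL_EXAMPLE",  # hypothetical compound ID
            "standard_type": "Ki",
            "standard_value": "4.5",
            "standard_units": "nM",
        }
    ]
}

if __name__ == "__main__":
    print(activity_query_url("CHEMBL204", "Ki", limit=5))
    print(summarise(sample_payload))
```

The URL builder and the parser are kept separate so that the parsing logic can be checked offline; fetching the URL (for example with `urllib.request.urlopen`) is left to the reader.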
SureChEMBL
https://www.surechembl.org
• New public chemistry patent resource
• ‘Acquired’ SureChem product from Digital Science
– Automatically extracted chemical structures from full-text patents
– ~15 million chemical structures
– Updated daily
– Plan to add molecular target, sequence, disease, animal model, cell-line indexing….
https://www.ebi.ac.uk/chembl
About ChEMBL
Compound View - 1
Compound View - 2
Compound View - 3
Compound View - 4
Target Search
Browse Targets
Browse Targets - Organism
Browse Drugs
Drugs
Targets of Launched Drugs
Overington et al, Nat. Rev. Drug Disc., 5, pp. 993-996 (2006)
Drug Targets and Drugs
Santos et al, unpublished
Different Types of Drugs
Santos et al, unpublished
[Pie charts: Drugs Approved 2013; Assigned USANs 2013. Legend categories:]
• Synthetic small molecule
• Natural product-derived small molecule
• Monoclonal antibody
• Other protein
• Polymer
• Peptide
• Oligonucleotide
• Oligosaccharide
• Inorganic
• Other
Affinity of Drugs for their ‘Targets’
Ki, Kd, IC50, EC50, & pA2 endpoints for drugs against their ‘efficacy targets’
[Histogram: x-axis -log10 affinity from 2 to 12, corresponding to 10 mM, 1 mM, 100 µM, 10 µM, 1 µM, 100 nM, 10 nM, 1 nM, 100 pM, 10 pM, 1 pM; y-axis frequency, 0 to 400.]
Overington, et al, Nature Rev. Drug Discov. 5 pp. 993-996 (2006)
Gleeson et al, Nature Rev. Drug Discov. 10 pp. 197-208 (2011)
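The -log10 affinity axis puts endpoints reported in different concentration units on one scale. A minimal sketch of that conversion follows; the function name `p_value` and the unit table are illustrative choices, not part of ChEMBL, and the worked value uses the Ki of 4.5 nM shown earlier for thrombin.

```python
import math

# Conversion factors from common concentration units to molar.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def p_value(value, unit):
    """Return -log10 of a concentration after converting it to molar."""
    if value <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(value * UNIT_TO_MOLAR[unit])


if __name__ == "__main__":
    # Ki = 4.5 nM corresponds to a pKi of about 8.35.
    print(round(p_value(4.5, "nM"), 2))
    # The axis end points: 10 mM is 2 and 1 pM is 12 on this scale.
    print(round(p_value(10, "mM"), 2), round(p_value(1, "pM"), 2))
```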
Privileged Target Families
• Rhodopsin-like GPCR (PDBe: 3sn6): 22% of drug targets, 33% of small mol drugs
• Ion channels (PDBe: 4kfm): 12% of drug targets, 18% of small mol drugs
• Nuclear receptors (PDBe: 3e00): 6% of drug targets, 17% of small mol drugs
• Protein kinases (PDBe: 4foc): 13% of drug targets, 2.4% of small mol drugs
Over 53% of all targets and 70% of drugs modulate these four target classes
Santos, unpublished
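The headline figures can be checked by summing the per-family shares. In the sketch below, the pairing of each percentage pair with a family follows the order in which they appear on the slide and is an assumption; the totals do not depend on that pairing.

```python
# (share of drug targets %, share of small molecule drugs %) per family.
# The family-to-number pairing is assumed from the slide order.
FAMILY_SHARES = {
    "Rhodopsin-like GPCR": (22.0, 33.0),
    "Ion channels": (12.0, 18.0),
    "Nuclear receptors": (6.0, 17.0),
    "Protein kinases": (13.0, 2.4),
}

target_total = sum(targets for targets, _ in FAMILY_SHARES.values())
drug_total = sum(drugs for _, drugs in FAMILY_SHARES.values())

if __name__ == "__main__":
    # Prints the combined share of targets and of drugs for the four families.
    print(f"targets: {target_total:.1f}%  drugs: {drug_total:.1f}%")
```

The sums come to 53% of targets and 70.4% of small molecule drugs, matching the "over 53%" and "70%" quoted on the slide.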
Privileged Target Families ChEMBL17 Drugs
NFκB Pathway
FDA Approved Drugs
Clinical Candidates
Clinical Candidates
• Database of clinical development candidates
– Contains ~12,000 2-D structures/sequences
• Estimated size ~35-45,000 compounds
– Work in progress
• Deeper coverage of key gene families
• e.g. Protein kinases, 399 distinct clinical candidates
Pharma Industry Productivity
File Registration number vs. USAN date
[Plot: y-axis from 0 to 800,000; x-axis from 1960 to 2010; labels: Phase 2b date, ~Discovery date.]
Overington, unpublished
Clinical Kinome
Overington, Al-Lazikani & Wennerberg, unpublished
Kinase Inhibitors in Clinical Development
Overington, Bellis, Al-Lazikani & Wennerberg, unpublished
Clinical Kinome
• 399 Clinical stage human small molecule protein kinase inhibitors
– 29 Approved small molecule kinase inhibitors
– 38 Phase 3
– 143 Phase 2
– 189 Phase 1
• Phase 1:2 ratio is atypical due to many kinase inhibitor trials being phase 1/2 oncology trials
• 2D structures for 311 of these
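The counts above can be cross-checked with a few lines of arithmetic. This sketch only restates the numbers on the slide; the variable names are illustrative.

```python
# Clinical stage kinase inhibitor counts as given on the slide.
PHASE_COUNTS = {"Approved": 29, "Phase 3": 38, "Phase 2": 143, "Phase 1": 189}
WITH_2D_STRUCTURE = 311

total = sum(PHASE_COUNTS.values())
phase1_to_phase2 = PHASE_COUNTS["Phase 1"] / PHASE_COUNTS["Phase 2"]
structure_coverage = WITH_2D_STRUCTURE / total

if __name__ == "__main__":
    print(total)                                # sums to the 399 quoted
    print(round(phase1_to_phase2, 2))           # Phase 1 : Phase 2 ratio
    print(round(100 * structure_coverage, 1))   # % with a 2D structure
```

The phase counts sum to 399, the Phase 1:2 ratio is about 1.32, and roughly 78% of the compounds have a 2D structure.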
Kinase Inhibitor Polypharmacology
US launched
Tofacitinib, Tozasertib (Ph. II)
Lapatinib, Gefitinib, Erlotinib
Staurosporine (no trials)
Sunitinib, Sorafenib, Imatinib, Dasatinib
Adapted from Ghoreschi et al, Nature Immunology 10, 356-360 (2009)
GSK PKIS Data
ChEMBL – Assay Reliability
F.A. Krüger & J.P. Overington (2012) ‘Global analysis of small molecule binding to related protein targets’ PLoS Comp. Biol. 8, e1002333
Differences Between Human And Rat Orthologs
[Figure, panel "Human vs Rat": pKd Human against pKd Rat, with -log(Kd) Human as axis label.]
[Figure, panel "Distribution of affinity differences": density against |human pKd - rat pKd|.]
Differences Between Different Assays
[Figure, panel "Binding affinity in human and rat assays": pKd Assay1 against pKd Assay2.]
[Figure, panel "Distribution of inter-assay affinity differences": density against |human pKd - human pKd|.]
Ortholog vs Intra-assay Differences
[Figure, "Density distributions of ortholog and inter-assay differences": density against pKi_i - pKi_j.]
Krüger, PLoS Comp. Biol. 8, e1002333, DOI:10.1371/journal.pcbi.1002333
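The comparison on these slides rests on absolute differences between paired measurements of the same compound: human versus rat ortholog in one case, two independent assays on the same target in the other. A minimal sketch of that calculation follows. All pKd values below are hypothetical numbers made up for illustration, not data from ChEMBL or from the cited paper, and the helper name `abs_differences` is an illustrative choice.

```python
from statistics import mean, median


def abs_differences(pairs):
    """Absolute difference for each (measurement_a, measurement_b) pair."""
    return [abs(a - b) for a, b in pairs]


# Hypothetical paired pKd values (illustration only).
human_vs_rat = [(7.2, 7.0), (8.1, 8.4), (6.5, 6.4), (9.0, 8.3)]
assay1_vs_assay2 = [(7.2, 7.5), (8.1, 7.9), (6.5, 6.9), (9.0, 8.6)]

ortholog_diffs = abs_differences(human_vs_rat)
inter_assay_diffs = abs_differences(assay1_vs_assay2)

if __name__ == "__main__":
    for label, diffs in (("ortholog", ortholog_diffs),
                         ("inter-assay", inter_assay_diffs)):
        print(label, round(median(diffs), 2), round(mean(diffs), 3))
```

Plotting the two lists as density curves on a shared axis gives the kind of overlay described on the "Ortholog vs Intra-assay Differences" slide.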
ChEMBL – Domain Annotation
Domain-level Annotation
• Site of binding is important in understanding and controlling function
• often several sites within same target protein
• Recently annotated binding sites (where possible) for entire ChEMBL target dictionary
• used Pfam domains http://www.pfam.org
Domain ‘poisoning’ of sequence queries
Krüger BMC Bioinformatics, 13, S11 DOI:10.1186/1471-2105-13-S17-S11
Kinase SYK (Q64725), R. norvegicus
Phosphatase SH-PTP2 (P35235), R. norvegicus
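The ‘poisoning’ problem can be illustrated with the two proteins above: a kinase and a phosphatase that share a domain will look related to a whole-sequence query even though their ligands bind different domains. The sketch below is an illustration under stated assumptions, not the published annotation procedure: the domain architectures are simplified, the set of ligand-binding domains is a hand-picked example, and the rule "assign only when exactly one candidate domain is present" is an assumed simplification.

```python
# Simplified Pfam architectures (assumed for illustration).
ARCHITECTURES = {
    "SYK": ["SH2", "SH2", "Pkinase_Tyr"],
    "SH-PTP2": ["SH2", "SH2", "Y_phosphatase"],
}

# Hand-picked example set of domains treated as small molecule binding.
LIGAND_BINDING_DOMAINS = {"Pkinase", "Pkinase_Tyr", "7tm_1", "Y_phosphatase"}


def shared_domains(arch_a, arch_b):
    """Domains present in both architectures (the source of 'poisoning')."""
    return sorted(set(arch_a) & set(arch_b))


def assign_binding_domain(architecture, binding_domains=LIGAND_BINDING_DOMAINS):
    """Return the single candidate binding domain, or None if ambiguous or absent."""
    candidates = [d for d in dict.fromkeys(architecture) if d in binding_domains]
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    syk, shp2 = ARCHITECTURES["SYK"], ARCHITECTURES["SH-PTP2"]
    print(shared_domains(syk, shp2))     # the shared SH2 domain
    print(assign_binding_domain(syk))    # kinase domain
    print(assign_binding_domain(shp2))   # phosphatase domain
```

Comparing targets on the assigned binding domain rather than on the full sequence keeps the shared SH2 domains from linking the kinase to the phosphatase.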
Domain-level Binding Sites
Depleted and Enriched Pfam Domains

Depleted
Neur_chan_memb   -1.63
zf-C4            -0.94
ANF_receptor     -0.88
SH2              -0.83
Pkinase_C        -0.70
fn3              -0.53
SH3_1            -0.51
Lig_chan         -0.50
C2               -0.50
C1_1             -0.50
Guanylate_cyc    -0.46
HATPase_c        -0.46
I-set            -0.44
adh_short        -0.39
PH               -0.39
Ank              -0.39
…..

Enriched
Metallophos       0.35
Phospholip_A2_1   0.38
Peptidase_M10     0.41
Asp               0.45
SNF               0.48
Hist_deacetyl     0.48
Carb_anhydrase    0.50
Peptidase_C1      0.51
Trypsin           0.51
Beta-lactamase    0.57
p450              1.00
Hormone_recep     1.19
Ion_trans         1.66
Neur_chan_LBD     2.02
Pkinase_Tyr       2.12
Pkinase           5.87
7tm_1             7.30
Krueger and Overington, unpublished
Binding Between Multiple Domains
Identified only 12 multi-domain architectures (corresponding to 120 ChEMBL targets) with ligand binding mediated via more than one domain.
PDBe: 3goi
Krüger BMC Bioinformatics, 13, S11 DOI:10.1186/1471-2105-13-S17-S11
https://www.ebi.ac.uk/chembl/research/ppdms
Better prediction of pathway perturbation
Overington, unpublished
Domain specific modulation – mTor
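[Figure: mTOR domain architecture (HEAT repeat, FAT, FRB, kinase, RD, FATC) annotated with the compounds Sirolimus (rapamycin) and PI-103, the complexes mTORC1 and mTORC2, and the regions FKBP-12 binding and mTORC binding.]
Labels shown: Gable, Rictor, mSIN1, MLST8, Raptor, Tel2, FBXW7, DEPTOR, FKBP-12, mSLT8, FKBP-38, Rheb, S6K1, PRAS40
Indications shown: Immunosuppression, Cancer; Cancer
Overington, unpublished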
Acknowledgements
ChEMBL Database: Anne Hersey, Anna Gaulton, Mark Davies, Michal Nowotka, George Papadatos, Jon Chambers, Louisa Bellis, Rita Santos, Gerard Van Westen, Ruth Akhtar, Francis Atkinson, Patricia Bento, Ramesh Donadi, John Paul Overington
Institute of Cancer Research: Bissan Al-Lazikani, Paul Workman
FIMM, Helsinki: Krister Wennerberg
University of Dundee: Andrew Hopkins
http://chembl.blogspot.com