Exploring available compound data with the Open PHACTS Discovery Platform and KNIME
252nd ACS National Meeting
Daniela Digles, Gerhard F. Ecker
Philadelphia, PA, August 21, 2016
Pre-competitive Informatics: Pharma are all accessing, processing, storing & re-processing external research data
[Figure: Literature, PubChem, Genbank, Patents, Databases and Downloads feed Data Integration and Data Analysis behind Firewalled Databases; repeated at each company]
Lowering industry firewalls: pre-competitive informatics in drug discovery
Nature Reviews Drug Discovery (2009) 8, 701-708 doi:10.1038/nrd2944
Different concept types
@gray_alasdair, Big Data Integration
[Figure: platform architecture diagram. Recoverable labels:
Data sources: Db and Nanopub, each with a VoID description; Public Content, Commercial, Public Ontologies, User Annotations
Core Platform: Data Cache (Virtuoso Triple Store), Semantic Workflow Engine, Indexing, Identity Resolution Service, Identifier Management Service, Chemistry Registration Normalisation & Q/C, Domain Specific Services
Access: Linked Data API (RDF/XML, TTL, JSON), Apps
Example identifiers: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”]
Workflow tools
• Single "blocks" for each data processing step (e.g. data reader, calculations, visualization, …)
• Blocks are placed via drag-and-drop and connected to each other with arrows.
• Commercial (e.g. Pipeline Pilot) and free tools (e.g. KNIME) available.
[Figure: example workflow with the blocks Data input, Data processing, Data processing, Data export and View data]
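The block-and-arrow idea can be sketched in plain Python as a chain of functions, each standing for one block. This only illustrates the concept and is not how KNIME or Pipeline Pilot work internally; the block names and the sample rows are made up for the example.

```python
from functools import reduce
from typing import Callable

Row = dict
Block = Callable[[list[Row]], list[Row]]


def data_input() -> list[Row]:
    # Hypothetical sample data standing in for a file or database reader.
    return [
        {"name": "compound A", "activity_nM": 50.0},
        {"name": "compound B", "activity_nM": 5000.0},
        {"name": "compound C", "activity_nM": 800.0},
    ]


def keep_potent(rows: list[Row]) -> list[Row]:
    # Processing block 1: keep rows with an activity below 1000 nM.
    return [r for r in rows if r["activity_nM"] < 1000.0]


def add_micromolar(rows: list[Row]) -> list[Row]:
    # Processing block 2: add a column with the activity in micromolar.
    return [{**r, "activity_uM": r["activity_nM"] / 1000.0} for r in rows]


def run_workflow(rows: list[Row], blocks: list[Block]) -> list[Row]:
    # The "arrows": the output of each block is the input of the next one.
    return reduce(lambda data, block: block(data), blocks, rows)


result = run_workflow(data_input(), [keep_potent, add_micromolar])
for row in result:  # "View data" block
    print(row)
```

Reordering or swapping entries in the block list changes the workflow without touching the individual blocks, which is the main appeal of such tools.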
KNIME
• KNIME Analytics Platform
• Available from www.knime.org
• Open source data analytics, reporting and integration platform
• Workflows can be built by connecting "Nodes"
• Open PHACTS KNIME nodes available from GitHub: https://github.com/openphacts/OPS-Knime
Open PHACTS KNIME
executable API call
Swagger
Structured format for the generation of API documentation.
https://dev.openphacts.org/swagger/spec/ops_1_5.json
(….)
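Because the specification is machine-readable, the available API calls and their parameters can be listed programmatically. The following is a minimal sketch, assuming a Swagger 2.0-style layout with a top-level `paths` object; the embedded fragment is a hypothetical, heavily shortened example and not an excerpt of the Open PHACTS specification.

```python
import json

# Hypothetical specification fragment for illustration only; the path and
# parameter names are invented and the layout is assumed to be Swagger 2.0.
SAMPLE_SPEC = json.loads("""
{
  "swagger": "2.0",
  "paths": {
    "/example/compound": {
      "get": {
        "summary": "Example compound lookup",
        "parameters": [
          {"name": "uri", "in": "query", "required": true},
          {"name": "_format", "in": "query", "required": false}
        ]
      }
    }
  }
}
""")


def list_operations(spec):
    """Return (method, path, required names, optional names) per operation."""
    operations = []
    for path, methods in sorted(spec.get("paths", {}).items()):
        for method, details in sorted(methods.items()):
            if not isinstance(details, dict):
                continue  # skip non-operation entries such as shared parameters
            params = details.get("parameters", [])
            required = [p["name"] for p in params if p.get("required")]
            optional = [p["name"] for p in params if not p.get("required")]
            operations.append((method.upper(), path, required, optional))
    return operations


for operation in list_operations(SAMPLE_SPEC):
    print(operation)
```

To inspect the real specification, the JSON document at the URL above would be loaded in place of `SAMPLE_SPEC`, after checking that its layout matches the assumption.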
Obtaining the results
Answering “scientific competency questions”
• 20 questions defined at the beginning of the project.
• Example: Give me all oxidoreductase inhibitors active <100 nM in human and mouse.
• Many questions need a combination of queries to the Open PHACTS Platform.
Questions: Azzaoui K et al. (2013) Drug Discov. Today 18: 843 – 852.
Workflows: Chichester C et al. (2015) Drug Discov. Today 20: 399 – 405.
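As a rough illustration of why one question can need several queries, the example question can be split into a target-class lookup and an activity lookup whose results are then combined. The records below are invented stand-ins for query results, and reading "in human and mouse" as "active in both species" is an assumption.

```python
# Invented stand-ins for the results of two separate platform queries.
targets_in_class = {  # query 1: oxidoreductase targets and their organism
    "T1": "Homo sapiens",
    "T2": "Mus musculus",
    "T3": "Rattus norvegicus",
}
activities = [  # query 2: (compound, target, activity value in nM)
    ("cpd1", "T1", 20.0),
    ("cpd1", "T2", 75.0),
    ("cpd2", "T1", 40.0),
    ("cpd2", "T2", 900.0),
    ("cpd3", "T3", 5.0),
]

THRESHOLD_NM = 100.0


def active_compounds(organism):
    # Compounds with at least one activity below the threshold against a
    # target of the requested class and organism.
    return {
        compound
        for compound, target, value_nm in activities
        if targets_in_class.get(target) == organism and value_nm < THRESHOLD_NM
    }


human = active_compounds("Homo sapiens")
mouse = active_compounds("Mus musculus")
both = sorted(human & mouse)  # combination step: active in both species
print(both)
```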
Example workflow
Q10: For a given compound, summarize all similar compounds and their activities
CC1=C(C(C(=C(N1)C)C(=O)OC)C2=CC=CC=C2[N+](=O)[O-])C(=O)OC
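One practical detail when sending such a structure to a web API: SMILES strings contain characters such as `=`, `+`, `[` and `(` that must be percent-encoded in a query string. The sketch below shows only the encoding step; the base URL, path and parameter names are placeholders chosen for this example, and the real ones would have to be taken from the API documentation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

SMILES = "CC1=C(C(C(=C(N1)C)C(=O)OC)C2=CC=CC=C2[N+](=O)[O-])C(=O)OC"

# Placeholder values: base URL, path and parameter names are assumptions
# made for this sketch and are not taken from the slides.
BASE_URL = "https://api.example.org"
PATH = "/structure/similarity"


def build_query_url(smiles, threshold):
    # urlencode percent-encodes reserved characters in the parameter values.
    query = urlencode({"molecule": smiles, "threshold": threshold})
    return f"{BASE_URL}{PATH}?{query}"


url = build_query_url(SMILES, 0.9)
print(url)

# Decoding the query string gives back the original SMILES unchanged.
decoded = parse_qs(urlsplit(url).query)
```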
Workflow to collect compound data
• Data for retrieved molecules:
  – Function and toxicity annotation (DrugBank)
  – Role of the molecule (ChEBI)
  – Pharmacology data, activity < 10 µM (ChEMBL)
  – Patent data (SureChEMBL)
• Data for retrieved targets:
  – Pathways (WikiPathways)
  – Diseases (DisGeNET)
• Example: propafenone derivative
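Collecting data from several sources amounts to grouping the returned records by compound. A minimal sketch of that grouping step, using invented records (the identifiers, field names and values are placeholders, not results from the workflow):

```python
from collections import defaultdict

# Invented example records: (compound id, source, field, value).
records = [
    ("cpd1", "DrugBank", "function", "example function text"),
    ("cpd1", "ChEBI", "role", "example role"),
    ("cpd1", "ChEMBL", "activity_uM", 0.4),
    ("cpd2", "ChEMBL", "activity_uM", 7.5),
    ("cpd2", "SureChEMBL", "patent", "EXAMPLE-PATENT-1"),
]


def collect_by_compound(rows):
    """Group values as compound -> source -> field -> list of values."""
    table = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for compound, source, field, value in rows:
        table[compound][source][field].append(value)
    return table


overview = collect_by_compound(records)
for compound in sorted(overview):
    sources = ", ".join(sorted(overview[compound]))
    print(f"{compound}: data from {sources}")
```

Keeping the source as a separate level makes it easy to see which database contributed which annotation.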
Collected results: Compound information
• Structure search: 96 molecules, including the molecule itself.
• Compound information/classification: 1 known drug, propafenone
Collected results: Patent information
• Highest confidence score:
  – Patents found for 3 molecules
  – No patents found for the original structure
• Lower confidence score:
  – Patents found for 8 molecules
  – 2 patents found for the original structure (High throughput assay for discovering new inhibitors of the GIRK1/4 channel)
• Restriction: Markush structures are not enumerated in SureChEMBL
Collected results: Bioactivity values
191 activity values (lower than 10 µM) against 33 targets.
[Figure labels: P-glycoprotein; Cells expressing P-glycoprotein; Propafenone]
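A summary such as "N activity values below 10 µM against M targets" requires the values to be in one unit before the threshold is applied. A small sketch of that step, with invented records and a hand-written conversion table (how units are reported by the platform is not shown in the slides and is assumed here):

```python
# Factors for converting common concentration units to micromolar.
TO_MICROMOLAR = {"nM": 1e-3, "uM": 1.0, "mM": 1e3}

# Invented activity records: (target, value, unit).
activities = [
    ("target A", 250.0, "nM"),
    ("target A", 12.0, "uM"),
    ("target B", 0.002, "mM"),
    ("target C", 50000.0, "nM"),
]


def below_threshold(rows, threshold_um=10.0):
    # Convert every value to micromolar, then keep those under the threshold.
    kept = []
    for target, value, unit in rows:
        value_um = value * TO_MICROMOLAR[unit]
        if value_um < threshold_um:
            kept.append((target, value_um))
    return kept


kept = below_threshold(activities)
n_values = len(kept)
n_targets = len({target for target, _ in kept})
print(f"{n_values} activity values against {n_targets} targets")
```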
Collected results: Targets
Target classifications (per compound)
Target classifications (unique targets)
Collected results: Pathways for targets
• 98 pathways in total
• 4 pathways contain > 5 of the identified targets
  – GPCR downstream signalling
  – GPCR ligand binding
• Relevance of the pathways?
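Counting how many of the identified targets fall into each pathway can be sketched as follows. The target-to-pathway assignments are invented stand-ins for query results, and a simple count is only a first filter that does not settle the relevance question.

```python
from collections import defaultdict

# Invented target-to-pathway assignments standing in for query results.
target_pathways = {
    "T1": ["pathway X", "pathway Y"],
    "T2": ["pathway X"],
    "T3": ["pathway X", "pathway Z"],
    "T4": ["pathway X", "pathway Y"],
    "T5": ["pathway X"],
    "T6": ["pathway X", "pathway Y"],
}


def targets_per_pathway(mapping):
    # Invert the mapping: pathway -> set of targets found in it.
    members = defaultdict(set)
    for target, pathways in mapping.items():
        for pathway in pathways:
            members[pathway].add(target)
    return members


members = targets_per_pathway(target_pathways)
frequent = sorted(p for p, targets in members.items() if len(targets) > 5)
print(len(members), "pathways in total;", frequent, "contain more than 5 targets")
```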
Collected results: Diseases for targets
> 2000 diseases in 25 disease classes
Conclusions
• The workflow allows the easy preparation of a first overview of known data for a compound of interest.
• New ideas for targets to test the compounds against.
  – Example: Serotonin receptor for propafenone derivatives
• Literature (PubMed) is returned for the results.
• Additional external or in-house data can be added.
• Methods for prioritization needed:
  – Relevance of pathways
  – Relevance of diseases
Useful links
Open PHACTS: http://www.openphactsfoundation.org/
API: https://dev.openphacts.org/
Support portal: http://support.openphacts.org/
Example Workflows: http://www.myexperiment.org/groups/1125.html
Presentations on YouTube: https://www.youtube.com/user/OpenPHACTS
For help or feedback: [email protected]
Acknowledgements
Pharmacoinformatics research group, University of Vienna
– Gerhard F. Ecker
– Barbara Zdrazil
– Jana Gurinova
Open PHACTS – KNIME
– Ronald Siebes, VU Amsterdam
– Christine Chichester, SIB
– Evan Tzanis, QMUL