Open PHACTS – experience of sustainability
MIOSS 2016
nick@openphactsfoundation.org
Openphacts.org
Open PHACTS Mission:
Integrate Multiple Research Biomedical Data Resources Into A Single Open & Free Access Point
…and make it sustainable in the long term
Literature; PubChem; Genbank; Patents; Databases; Downloads
Data Analysis; Data Integration; Firewalled Databases
How do pharma companies use public data?
P12047
X31045
GB:29384
Andy Law's Third Law: “The number of unique identifiers assigned to an individual is never less than the number of Institutions involved in the study”
http://bioinformatics.roslin.ac.uk/lawslaws/
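The identifier problem behind Andy Law's Third Law can be illustrated with a minimal sketch of identifier resolution: identifiers that are declared equivalent are grouped so that any one of them resolves to the same entity. The `IdentifierIndex` class and its methods are hypothetical names introduced here, not part of the Open PHACTS platform, and the equivalences between the three identifiers from the slide are asserted purely for illustration, not taken from real mappings.

```python
class IdentifierIndex:
    """Groups identifier strings that have been declared equivalent (union-find)."""

    def __init__(self):
        # Maps each identifier to its parent; a root is its own parent.
        self._parent = {}

    def _find(self, ident):
        # Register unseen identifiers as their own group, then walk to the root.
        self._parent.setdefault(ident, ident)
        while self._parent[ident] != ident:
            ident = self._parent[ident]
        return ident

    def link(self, a, b):
        """Declare that identifiers a and b refer to the same entity."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_entity(self, a, b):
        return self._find(a) == self._find(b)

    def aliases(self, ident):
        """Return every known identifier in the same group, sorted."""
        root = self._find(ident)
        return sorted(i for i in self._parent if self._find(i) == root)


# Hypothetical link-set entries: the slide's identifiers, assumed equivalent.
index = IdentifierIndex()
index.link("P12047", "X31045")
index.link("X31045", "GB:29384")

print(index.same_entity("P12047", "GB:29384"))  # True
print(index.aliases("GB:29384"))  # ['GB:29384', 'P12047', 'X31045']
```

Equivalence is transitive here, so two institutions' identifiers are connected as soon as each has been linked to a third.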
ChEMBL; DrugBank; Gene Ontology; Wikipathways; UniProt; ChemSpider; UMLS; ConceptWiki; ChEBI
TrialTrove; GVKBio; GeneGo; TR Integrity
“Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM”
“What is the selectivity profile of known p38 inhibitors?”
“Let me compare MW, logP and PSA for known oxidoreductase inhibitors”
DisGeNet; neXtProt; ChEMBL Target Class; ENZYME; FDA adverse events; SureChEMBL
Open PHACTS architecture (diagram labels):
Apps
Linked Data API (RDF/XML, TTL, JSON)
Core Platform; Semantic Workflow Engine; Data Cache (Virtuoso Triple Store)
Domain Specific Services; Chemistry Registration, Normalisation & Q/C; Indexing
Identity Resolution Service; Identifier Management Service
Example identifiers: P12374; EC2.43.4; CS4532; “Adenosine receptor 2a”
Data sources: Public Content; Commercial; Public Ontologies; User Annotations
Per-source labels: VoID; Nanopub; Db
“CUTTING THE GORDIAN KNOT”
What are the problems with licensing we had to address?
– To make the data and software generated by the project usable and reusable
– Multiplicity of unclear or non-standard licenses on original data sources
• ‘Public’ can mean use but not redistribute, use in commercial environment,
• Legal position on use and reuse extremely unclear
• Different issues than just linking to data
– What is the legal status of integrated collections of the above, and of derived knowledge?
– Appropriate software license selection
– Legal clarity for EFPIA and end users
– Approaches for commercial data integration, EFPIA in-house data
AIM: to enable maximum possible dissemination and usability of the integrated data and
architecture generated by the project - with approaches that will be applicable in other
data integration projects
Licensing Challenges
Dataset              Downloaded    Version      Licence          Triples
Bio Assay Ontology   -             -            CC-By            10,360
CALOHA               8 Apr 2015    2014-01-22   CC-By-ND         14,552
ChEBI                4 Mar 2015    125          CC-By-SA         1,012,056
ChEMBL               18 Feb 2015   20.0         CC-By-SA         445,732,880
ConceptWiki          12 Dec 2013   -            CC-By-SA         4,331,760
DisGeNET             31 Mar 2015   2.1.0        ODbL             15,011,136
Disease Ontology     -             2015-05-21   CC-By            188,062
DrugBank             19 Feb 2015   4.1          Non-commercial   4,028,767
ENZYME               -             2015_11      CC-By-ND         61,467
FDA Adverse Events   9 Jul 2012    -            CC0              13,557,070
Example Data Licenses
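As a rough illustration of why mixed licences complicate integration, the sketch below tags each dataset from the table with the restrictions its licence label implies and lists the datasets that would block a given kind of reuse. The restriction sets are a simplified, assumed reading of the licence names (not legal advice and not the project's own analysis), and `Dataset`, `LICENCE_RESTRICTIONS` and `problematic` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    licence: str
    triples: int


# Names, licences and triple counts copied from the table above.
DATASETS = [
    Dataset("Bio Assay Ontology", "CC-By", 10_360),
    Dataset("CALOHA", "CC-By-ND", 14_552),
    Dataset("ChEBI", "CC-By-SA", 1_012_056),
    Dataset("ChEMBL", "CC-By-SA", 445_732_880),
    Dataset("ConceptWiki", "CC-By-SA", 4_331_760),
    Dataset("DisGeNET", "ODbL", 15_011_136),
    Dataset("Disease Ontology", "CC-By", 188_062),
    Dataset("DrugBank", "Non-commercial", 4_028_767),
    Dataset("ENZYME", "CC-By-ND", 61_467),
    Dataset("FDA Adverse Events", "CC0", 13_557_070),
]

# ASSUMPTION: a simplified mapping from licence label to restrictions.
LICENCE_RESTRICTIONS = {
    "CC0": set(),
    "CC-By": {"attribution"},
    "CC-By-SA": {"attribution", "share-alike"},
    "CC-By-ND": {"attribution", "no-derivatives"},
    "ODbL": {"attribution", "share-alike"},
    "Non-commercial": {"non-commercial"},
}


def problematic(datasets, blocking):
    """Names of datasets whose licence carries any of the blocking restrictions."""
    return [d.name for d in datasets if LICENCE_RESTRICTIONS[d.licence] & blocking]


# Datasets that would need attention before commercial reuse of derived data.
print(problematic(DATASETS, {"no-derivatives", "non-commercial"}))
# ['CALOHA', 'DrugBank', 'ENZYME']

# Datasets whose share-alike terms could carry over to an integrated collection.
print(problematic(DATASETS, {"share-alike"}))
# ['ChEBI', 'ChEMBL', 'ConceptWiki', 'DisGeNET']
```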
DELIVERY UPDATE
Regular data updates as the core data refreshes
API updates aligned to new business questions and changes
Workstreams to add further new data – see later
New release 2.1, May 2016
– SureChEMBL and Pathways update
Further updates planned for summer 2016
Open PHACTS Evolution - Platform
[Diagram labels: Public Data; Public Data + Private Data; Public Data, VM, VM; Public Data + Commercial Data]
• Security Audited
Hosted platform
• Platform sustainability
Sustaining Impact
“Software is free like puppies are free - they both need money for maintenance”
…and more resource for future development
Open PHACTS Foundation Routes to Access
Access levels: Open API services; Unlimited API services; Unlimited API, RDF and Link sets; Open PHACTS Virtual Machine
Access Route:
Full OPF Member: ✓ ✓ ✓ ✓
Licensor/Reseller*: ✓ ✓ ✓
Licensor (Own Use): ✓ ✓ ✓
High volume API Licensor: ✓ ✓
Open Access API Consumer: ✓
Open Data Non-commercial: ✓ **
*3rd parties must have own agreement with OPF
** talk to us for collaborative proposals – non commercial use
Come and collaborate
New projects
Improve our code and services
Open Innovation projects
Webinars
New ideas for data services and workflows
info@openphactsfoundation.org @Open_PHACTS
Open PHACTS Practical Semantics
Acknowledgements
GlaxoSmithKline – Coordinator
Universität Wien – Managing entity
Technical University of Denmark
University of Hamburg, Center for Bioinformatics
BioSolveIT GmbH
Consorci Mar Parc de Salut de Barcelona
Leiden University Medical Centre
Royal Society of Chemistry
Vrije Universiteit Amsterdam
Novartis
Merck Serono
H. Lundbeck A/S
Eli Lilly
Netherlands Bioinformatics Centre
Swiss Institute of Bioinformatics
ConnectedDiscovery
EMBL-European Bioinformatics Institute
Janssen
Esteve
Almirall
OpenLink
Scibite
The Open PHACTS Foundation
Spanish National Cancer Research Centre
University of Manchester
Maastricht University
Aqnowledge
University of Santiago de Compostela
Rheinische Friedrich-Wilhelms-Universität Bonn
AstraZeneca
Pfizer