Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
www.openrisknet.org
OpenRiskNet: Open e-Infrastructure to Support Data Sharing, Knowledge Integration and in silico Analysis and Modelling in Risk Assessment
Project Number 731075
How to describe OpenRiskNet services and their functionality by
semantic annotationThomas Exner, Edelweiss Connect
www.openrisknet.org
Webinars seriesTopic Date & Time
Past events
Introduction sessions to the OpenRiskNet e-infrastructure
Webinar recordings:● Session 1 (24 Sep 2018)● Session 2 (27 Sep 2018)● Session 3 (4 Oct 2018)● Session 4 (30 Oct 2018)
Learn how to deploy the OpenRiskNet virtual research environment Webinar recordings (25 Feb 2019)
Demonstration on data curation and creation of pre-reasoned datasets in the OpenRiskNet framework Webinar recordings (18 Mar 2019)
Identification and linking of data related to AOPWiki (an OpenRiskNet case study) Webinar recordings (26 March 2019)
The Adverse Outcome Pathway Database (AOP-DB) Webinar recordings (8 April 2019)
Current event How to describe OpenRiskNet services and their functionality by semantic annotation
Monday, 13 May 2019, 16:00 CESThttps://openrisknet.org/events/64/
Future events
Use of Nextflow tool for toxicogenomics-based prediction and mechanism identification in OpenRiskNet e-infrastructure
Monday, 27 May 2019, 16:00 CESTRegistration: https://openrisknet.org/events/65/
Case study demo: ModelRX Tuesday, 11 June 2019, 16:00 CEST
Application deployment Date to be announced
Additional demo sessions on case studies June-July 2019 (to be announced)
https://openrisknet.org/events/
www.openrisknet.org
Outline
1. Introduction to the semantic annotation concept in OpenRiskNet2. Short introduction to ontologies and RDF3. Goal 1: Make OpenRiskNet services identify themselves4. Goal 2: Description of the capacities of a service5. Goal 3: Easy access to additional information
www.openrisknet.org
Main components of OpenRiskNet
1. Case-study-driven development - examples of tools to be integrated are selected based on the case study needs
2. Information on case studies in the areas of chemical and nanomaterial risk assessment can be found at https://openrisknet.org/development/case-studies/
3. Solutions for all areas by integrating existing tools from consortium and associated partners
4. Integrated approach combining experimental data (in vivo, in vitro, in chemico) with analysis, modelling and simulation tools to workflows for exposure, hazard and risk assessment
5. Early testing by all stakeholder groups
www.openrisknet.org
Case studies based on risk assessment framework
Berggren et al., 2017
www.openrisknet.org
Main components of OpenRiskNet - technology1. REST services providing data and processing/analysis/modelling tools (provided by
OpenRiskNet and associated partners)
2. Concept to harmonize APIs in an bottom-up approach is available at https://openrisknet.org/development/api-concept/
3. No strict standards but communication through semantic interoperability layer, which provides information on the usage of the data and software including human- and computer readable input/output annotations.
4. Microservice architecture based on containerization and container orchestration accompanied by a discovery service
5. Virtual infrastructures, which can be deployed on public or in-house clouds - reference environment available at .https://home.prod.openrisknet.org
www.openrisknet.org
What do we want to achieve?1. OpenRiskNet services have to be identified as such e.g. for listing in the
OpenRiskNet catalogue and as being accessible on a virtual research infrastructure
2. Provide information on the service in human- and machine-readable form:
a. Capabilities of the service (for databases: used data schema, i.e. the description of the stored data and the associated metadata, as well as search options; for software tools: the type and amount of generated output including the options and parameters)
b. Requirement on input data types and formats
3. Make access to additional information easy:
a. Scientific background of the service, this can just be a link to the relevant publication but also to manuals, tutorials and other training material
b. Technical background like links to source code, installation instructions and deployment options
www.openrisknet.org
Our solution:Fusion of two as-yet largely unrelated technologies: OpenAPI and JSON-LD, where the later opens up the world of ontologies, semantic web and Resource Description Framework (RDF) triplestores
OpenRiskNet services are REST web services described using OpenAPI 3.0 (Swagger) descriptors, annotation with ontological definitions defined using JSON-LD.
● Open API 3.0 Specification
● JSON-LD
D2.2 and D2.4 reports to the EU describing the thinking behind the choice of OpenAPI and Json-LD.
www.openrisknet.org
Helpful tools
www.openrisknet.org
First questions to audience:1. Are you familiar with web services descriptions using swagger /
OpenAPI? (yes/no)2. Are you aware of the concept of annotation with ontologies? (yes/no)3. Did you ever semantically annotate data or other things? (yes/no)
www.openrisknet.org
Short intro to ontologies and semantic annotation
www.openrisknet.org
Short intro to ontologies:
Evolved from the interoperability requirement for system integration• data heterogeneity• semantic heterogeneity
An ontology is a data model that represents a domain and is used to reason about the objects in that domain and the relations between them
An ontology is a (partial) specification of a shared conceptualization, i.e., it is usually a logical theory that expresses the conceptualization explicitly in some language. A conceptualization can be defined as an intensional semantic structure that encodes implicit knowledge constraining the structure of a piece of a domain.
www.openrisknet.org
Short intro to ontologies:
The use of ontologies began in the biological sciences around 1998 with the development of the Gene Ontology (GO). By 2007, there was sufficient interest and activity in the area to merit national and international coordination efforts such as the Open Biomedical Ontologies (OBO) Foundry or the National Center for Biomedical Ontologies. Provided by George Gkoutos, UoB
www.openrisknet.org
Short intro to ontologies:
The backbone of ontology is often a taxonomy. Taxonomy is a classification of things in a hierarchical form. It is usually a tree or a lattice that expresses subsumption relation - i.e., A subsumes B meaning that everything that is in A is also in B. An example is classification of living organisms.
www.openrisknet.org
Short intro to ontologies:
www.openrisknet.org
Short intro to ontologies:
The modular design uses inheritance of ontologies - upper ontologies describe general knowledge, and application ontologies describe knowledge for a particular application
www.openrisknet.org
Short intro to ontologies:
Provided by Egon Willighagen, UM
www.openrisknet.org
Short intro to semantic annotation: Resource Description Framework (RDF)
Resource Description Framework (RDF) is a framework for representing information about resources in a graph form. Information is represented by triples subject-predicate-object.
https://www.obitko.com/tutorials/ontologies-semantic-web/resource-description-framework.html
www.openrisknet.org
RDF triples in JSON-LD
www.openrisknet.org
Swagger / OpenAPI descriptions:
www.openrisknet.org
OpenAPI descriptions as JSON files
www.openrisknet.org
OpenAPI + JSON-LD
Subject or object
Predicate
www.openrisknet.org
Why use ontologies in OpenRiskNet?
● Inputs, outputs and operations that are semantically identical should be understood to be so, despite different spellings etc.
● Ontologies are richer than controlled vocabularies as they allow reasoning (mainly used here for forming Subject-Predicate-Object statements right now)
● Several different ontologies can be used that view related concepts from different points of view
www.openrisknet.org
Second questions to audience:1. Are you interested in a more detailed introduction to ontologies and
semantic web/SPARQL? (yes/no)2. Which form would you prefer? (webinar/f2f/both)
www.openrisknet.org
Goal 1: Make OpenRiskNet services identify themselves
www.openrisknet.org
OpenRiskNet infrastructure components
1. The OpenRiskNet service catalogue listing all compliant service (https://openrisknet.org/e-infrastructure/services/)
2. A discovery service listing all running services on a VRE (e.g. https://home.prod.openrisknet.org)
3. A SPARQL query tester to access the semantic annotation (http://orn-registry-openrisknet-registry.prod.openrisknet.org/)
All these support services need to know, what services are out there.
www.openrisknet.org
Registration of services as simple as possible
More details can be found at: https://github.com/OpenRiskNet/home/wiki/Annotating-API-to-make-it-queryable
No comprehensive ontology available for the service types right now. Collection of needed terms and integrated into one existing ontology ongoing. “x-...” can be used as placeholder.
www.openrisknet.org
Similar for individual API endpoints
www.openrisknet.org
And to make it discoverable in OpenShiftThe OpenRiskNet registry runs in the OpenRiskNet VRE and listens for service and routes starting or stopping. When one starts the registry inspects the OpenAPI definition and adds it to the list of known services. When it stops it is removed from the list.
To be recognised by the ORN registry your service needs to be annotated like this:
More information at: https://github.com/OpenRiskNet/home/wiki/Annotating-your-service-to-make-it-discoverableand in an upcoming webinar on service deployment
www.openrisknet.org
This can already produce nice listings:
http://orn-registry-openrisknet-registry.prod.openrisknet.org/
www.openrisknet.org
Goal 2: Information on capabilities of a service
www.openrisknet.org
Semantic annotation of capabilitiesOpenRiskNet provides information on the service capabilities on two extreme levels:● High level for identifying relevant services● Low level for extracting specific data from API endpoints
www.openrisknet.org
OpenRiskNet extensions:
On the highest level
OpenAPI standard:
OpenTox specifications:
x-orn-@type: "x-orn:Service" ,
x-orn:expects: {
x-orn-@id: "x-orn:OperrationParameters"},
x-orn:returns: {
x-orn-@id: "x-orn:JaqpotModelingTaskId"}
operationId: "trainModel"
- Compound- Feature- Dataset- Algorithm- Model- Validation- Report- Task
www.openrisknet.org
Example: ToxCast dataset
www.openrisknet.org
Finding Edelweiss datasets
www.openrisknet.org
Low level: data schema
www.openrisknet.org
Return values - OpenAPI schemas
www.openrisknet.org
Return values - OpenAPI schemas
www.openrisknet.org
Return values - OpenAPI schemas
www.openrisknet.org
Corresponding data
www.openrisknet.org
Context block
www.openrisknet.org
Becoming more specific: IC50 determined by hill model fitting using the tcpl library
https://github.com/OpenRiskNet/home/wiki/Modelling-IC50-results
www.openrisknet.org
Query tester
http://orn-registry-openrisknet-registry.prod.openrisknet.org/
www.openrisknet.org
Substancesubtree
Figure of full subtree can be found here:https://drive.google.com/open?id=1FSqerIWtZGhRrW2aRej8rdpAQ2X0SDzG
www.openrisknet.org
www.openrisknet.org
www.openrisknet.org
Goal 3: Easy access to additional information
www.openrisknet.org
There is never too much metadata
Proposal: Each OpenRiskNet service is equipped with an additional API endpoint providing metadata.
Standard fields:
● Provider of service● Link to service description● Links to manuals and tutorials● Links to scientific literature● Links to download page of container
www.openrisknet.org
But we can do more!
https://www.epa.gov/sites/production/files/2015-08/documents/toxcast_assay_annotation_data_users_guide_20141021.pdf
www.openrisknet.org
www.openrisknet.org
Conclusion
Goal 1: Identification of OpenRiskNet services
Goal 2: Capacities of a service
● High level description● Annotation of return values
Goal 3: Additional informations
www.openrisknet.org
Acknowledgements
OpenRiskNet (Grant Agreement 731075) is a project funded by the European Commission within Horizon 2020 Programme
Project partners:P1 Edelweiss Connect GmbH, Switzerland (EwC)P2 Johannes Gutenberg-Universität Mainz, Germany (JGU)P3 Fundacio Centre De Regulacio Genomica, Spain (CRG)P4 Universiteit Maastricht, Netherlands (UM)P5 The University Of Birmingham, United Kingdom (UoB)P6 National Technical University Of Athens, Greece (NTUA)P7 Fraunhofer Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.,
Germany (Fraunhofer)P8 Uppsala Universitet, Sweden (UU)P9 Medizinische Universität Innsbruck, Austria (MUI)P10 Informatics Matters Limited, United Kingdom (IM)P11 Institut National De L’environnement Et Des Risques INERIS, France (INERIS)P12 Vrije Universiteit Amsterdam, Netherlands (VU)
www.openrisknet.org
Webinars series https://openrisknet.org/events/
Topic Date & Time
Past events
Introduction sessions to the OpenRiskNet e-infrastructure
Webinar recordings:● Session 1 (24 Sep 2018)● Session 2 (27 Sep 2018)● Session 3 (4 Oct 2018)● Session 4 (30 Oct 2018)
Learn how to deploy the OpenRiskNet virtual research environment Webinar recordings (25 Feb 2019)
Demonstration on data curation and creation of pre-reasoned datasets in the OpenRiskNet framework Webinar recordings (18 Mar 2019)
Identification and linking of data related to AOPWiki (an OpenRiskNet case study) Webinar recordings (26 March 2019)
The Adverse Outcome Pathway Database (AOP-DB) Webinar recordings (8 April 2019)
Current event How to describe OpenRiskNet services and their functionality by semantic annotation
Monday, 13 May 2019, 16:00 CESThttps://openrisknet.org/events/64/
Future events
Use of Nextflow tool for toxicogenomics-based prediction and mechanism identification in OpenRiskNet e-infrastructure
Monday, 27 May 2019, 16:00 CESTRegistration: https://openrisknet.org/events/65/
Case study demo: ModelRX Tuesday, 11 June 2019, 16:00 CEST
Application deployment Date to be announced
Additional demo sessions on case studies June-July 2019 (to be announced)