How to describe OpenRiskNet services and their functionality by … · 2019. 5. 13. · JSON-LD, where the later opens up the world of ontologies, semantic web and Resource Description

www.openrisknet.org

OpenRiskNet: Open e-Infrastructure to Support Data Sharing, Knowledge Integration and in silico Analysis and Modelling in Risk Assessment

Project Number 731075

How to describe OpenRiskNet services and their functionality by

semantic annotationThomas Exner, Edelweiss Connect

www.openrisknet.org

Webinars seriesTopic Date & Time

Past events

Introduction sessions to the OpenRiskNet e-infrastructure

Webinar recordings:● Session 1 (24 Sep 2018)● Session 2 (27 Sep 2018)● Session 3 (4 Oct 2018)● Session 4 (30 Oct 2018)

Learn how to deploy the OpenRiskNet virtual research environment Webinar recordings (25 Feb 2019)

Demonstration on data curation and creation of pre-reasoned datasets in the OpenRiskNet framework Webinar recordings (18 Mar 2019)

Identification and linking of data related to AOPWiki (an OpenRiskNet case study) Webinar recordings (26 March 2019)

The Adverse Outcome Pathway Database (AOP-DB) Webinar recordings (8 April 2019)

Current event How to describe OpenRiskNet services and their functionality by semantic annotation

Monday, 13 May 2019, 16:00 CESThttps://openrisknet.org/events/64/

Future events

Use of Nextflow tool for toxicogenomics-based prediction and mechanism identification in OpenRiskNet e-infrastructure

Monday, 27 May 2019, 16:00 CESTRegistration: https://openrisknet.org/events/65/

Case study demo: ModelRX Tuesday, 11 June 2019, 16:00 CEST

Application deployment Date to be announced

Additional demo sessions on case studies June-July 2019 (to be announced)

https://openrisknet.org/events/

https://openrisknet.org/events/39/











www.openrisknet.org

Outline

1. Introduction to the semantic annotation concept in OpenRiskNet2. Short introduction to ontologies and RDF3. Goal 1: Make OpenRiskNet services identify themselves4. Goal 2: Description of the capacities of a service5. Goal 3: Easy access to additional information

www.openrisknet.org

Main components of OpenRiskNet

1. Case-study-driven development - examples of tools to be integrated are selected based on the case study needs

2. Information on case studies in the areas of chemical and nanomaterial risk assessment can be found at https://openrisknet.org/development/case-studies/

3. Solutions for all areas by integrating existing tools from consortium and associated partners

4. Integrated approach combining experimental data (in vivo, in vitro, in chemico) with analysis, modelling and simulation tools to workflows for exposure, hazard and risk assessment

5. Early testing by all stakeholder groups

https://openrisknet.org/development/case-studies/

www.openrisknet.org

Case studies based on risk assessment framework

Berggren et al., 2017

www.openrisknet.org

Main components of OpenRiskNet - technology1. REST services providing data and processing/analysis/modelling tools (provided by

OpenRiskNet and associated partners)

2. Concept to harmonize APIs in an bottom-up approach is available at https://openrisknet.org/development/api-concept/

3. No strict standards but communication through semantic interoperability layer, which provides information on the usage of the data and software including human- and computer readable input/output annotations.

4. Microservice architecture based on containerization and container orchestration accompanied by a discovery service

5. Virtual infrastructures, which can be deployed on public or in-house clouds - reference environment available at .https://home.prod.openrisknet.org

https://openrisknet.org/development/api-concept/

https://home.prod.openrisknet.org

www.openrisknet.org

What do we want to achieve?1. OpenRiskNet services have to be identified as such e.g. for listing in the

OpenRiskNet catalogue and as being accessible on a virtual research infrastructure

2. Provide information on the service in human- and machine-readable form:

a. Capabilities of the service (for databases: used data schema, i.e. the description of the stored data and the associated metadata, as well as search options; for software tools: the type and amount of generated output including the options and parameters)

b. Requirement on input data types and formats

3. Make access to additional information easy:

a. Scientific background of the service, this can just be a link to the relevant publication but also to manuals, tutorials and other training material

b. Technical background like links to source code, installation instructions and deployment options

www.openrisknet.org

Our solution:Fusion of two as-yet largely unrelated technologies: OpenAPI and JSON-LD, where the later opens up the world of ontologies, semantic web and Resource Description Framework (RDF) triplestores

OpenRiskNet services are REST web services described using OpenAPI 3.0 (Swagger) descriptors, annotation with ontological definitions defined using JSON-LD.

● Open API 3.0 Specification

● JSON-LD

D2.2 and D2.4 reports to the EU describing the thinking behind the choice of OpenAPI and Json-LD.

https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0.2.md

https://json-ld.org/

https://doi.org/10.5281/zenodo.1479444

https://doi.org/10.5281/zenodo.2597061

www.openrisknet.org

Helpful tools

www.openrisknet.org

First questions to audience:1. Are you familiar with web services descriptions using swagger /

OpenAPI? (yes/no)2. Are you aware of the concept of annotation with ontologies? (yes/no)3. Did you ever semantically annotate data or other things? (yes/no)

www.openrisknet.org

Short intro to ontologies and semantic annotation

www.openrisknet.org

Short intro to ontologies:

Evolved from the interoperability requirement for system integration• data heterogeneity• semantic heterogeneity

An ontology is a data model that represents a domain and is used to reason about the objects in that domain and the relations between them

An ontology is a (partial) specification of a shared conceptualization, i.e., it is usually a logical theory that expresses the conceptualization explicitly in some language. A conceptualization can be defined as an intensional semantic structure that encodes implicit knowledge constraining the structure of a piece of a domain.

www.openrisknet.org


The use of ontologies began in the biological sciences around 1998 with the development of the Gene Ontology (GO). By 2007, there was sufficient interest and activity in the area to merit national and international coordination efforts such as the Open Biomedical Ontologies (OBO) Foundry or the National Center for Biomedical Ontologies. Provided by George Gkoutos, UoB

www.openrisknet.org


The backbone of ontology is often a taxonomy. Taxonomy is a classification of things in a hierarchical form. It is usually a tree or a lattice that expresses subsumption relation - i.e., A subsumes B meaning that everything that is in A is also in B. An example is classification of living organisms.

www.openrisknet.org


www.openrisknet.org


The modular design uses inheritance of ontologies - upper ontologies describe general knowledge, and application ontologies describe knowledge for a particular application

www.openrisknet.org


Provided by Egon Willighagen, UM

www.openrisknet.org

Short intro to semantic annotation: Resource Description Framework (RDF)

Resource Description Framework (RDF) is a framework for representing information about resources in a graph form. Information is represented by triples subject-predicate-object.

https://www.obitko.com/tutorials/ontologies-semantic-web/resource-description-framework.html

https://www.obitko.com/tutorials/ontologies-semantic-web/resource-description-framework.html

www.openrisknet.org

RDF triples in JSON-LD

www.openrisknet.org

Swagger / OpenAPI descriptions:

www.openrisknet.org

OpenAPI descriptions as JSON files

www.openrisknet.org

OpenAPI + JSON-LD

Subject or object

Predicate

www.openrisknet.org

Why use ontologies in OpenRiskNet?

● Inputs, outputs and operations that are semantically identical should be understood to be so, despite different spellings etc.

● Ontologies are richer than controlled vocabularies as they allow reasoning (mainly used here for forming Subject-Predicate-Object statements right now)

● Several different ontologies can be used that view related concepts from different points of view

www.openrisknet.org

Second questions to audience:1. Are you interested in a more detailed introduction to ontologies and

semantic web/SPARQL? (yes/no)2. Which form would you prefer? (webinar/f2f/both)

www.openrisknet.org

Goal 1: Make OpenRiskNet services identify themselves

www.openrisknet.org

OpenRiskNet infrastructure components

1. The OpenRiskNet service catalogue listing all compliant service (https://openrisknet.org/e-infrastructure/services/)

2. A discovery service listing all running services on a VRE (e.g. https://home.prod.openrisknet.org)

3. A SPARQL query tester to access the semantic annotation (http://orn-registry-openrisknet-registry.prod.openrisknet.org/)

All these support services need to know, what services are out there.

https://openrisknet.org/e-infrastructure/services/

https://home.prod.openrisknet.org/

http://orn-registry-openrisknet-registry.prod.openrisknet.org/

www.openrisknet.org

Registration of services as simple as possible

More details can be found at: https://github.com/OpenRiskNet/home/wiki/Annotating-API-to-make-it-queryable

No comprehensive ontology available for the service types right now. Collection of needed terms and integrated into one existing ontology ongoing. “x-...” can be used as placeholder.

https://github.com/OpenRiskNet/home/wiki/Annotating-API-to-make-it-queryable

www.openrisknet.org

Similar for individual API endpoints

www.openrisknet.org

And to make it discoverable in OpenShiftThe OpenRiskNet registry runs in the OpenRiskNet VRE and listens for service and routes starting or stopping. When one starts the registry inspects the OpenAPI definition and adds it to the list of known services. When it stops it is removed from the list.

To be recognised by the ORN registry your service needs to be annotated like this:

More information at: https://github.com/OpenRiskNet/home/wiki/Annotating-your-service-to-make-it-discoverableand in an upcoming webinar on service deployment

https://github.com/OpenRiskNet/home/wiki/Annotating-your-service-to-make-it-discoverable

www.openrisknet.org

This can already produce nice listings:



www.openrisknet.org

Goal 2: Information on capabilities of a service

www.openrisknet.org

Semantic annotation of capabilitiesOpenRiskNet provides information on the service capabilities on two extreme levels:● High level for identifying relevant services● Low level for extracting specific data from API endpoints

www.openrisknet.org

OpenRiskNet extensions:

On the highest level

OpenAPI standard:

OpenTox specifications:

x-orn-@type: "x-orn:Service" ,

x-orn:expects: {

x-orn-@id: "x-orn:OperrationParameters"},

x-orn:returns: {

x-orn-@id: "x-orn:JaqpotModelingTaskId"}

operationId: "trainModel"

- Compound- Feature- Dataset- Algorithm- Model- Validation- Report- Task

www.openrisknet.org

Example: ToxCast dataset

www.openrisknet.org

Finding Edelweiss datasets

www.openrisknet.org

Low level: data schema

www.openrisknet.org

Return values - OpenAPI schemas

www.openrisknet.org


www.openrisknet.org


www.openrisknet.org

Corresponding data

www.openrisknet.org

Context block

www.openrisknet.org

Becoming more specific: IC50 determined by hill model fitting using the tcpl library

https://github.com/OpenRiskNet/home/wiki/Modelling-IC50-results

https://github.com/OpenRiskNet/home/wiki/Modelling-IC50-results

www.openrisknet.org

Query tester



www.openrisknet.org

Substancesubtree

Figure of full subtree can be found here:https://drive.google.com/open?id=1FSqerIWtZGhRrW2aRej8rdpAQ2X0SDzG

https://drive.google.com/open?id=1FSqerIWtZGhRrW2aRej8rdpAQ2X0SDzG

www.openrisknet.org

www.openrisknet.org

www.openrisknet.org

Goal 3: Easy access to additional information

www.openrisknet.org

There is never too much metadata

Proposal: Each OpenRiskNet service is equipped with an additional API endpoint providing metadata.

Standard fields:

● Provider of service● Link to service description● Links to manuals and tutorials● Links to scientific literature● Links to download page of container

www.openrisknet.org

But we can do more!

https://www.epa.gov/sites/production/files/2015-08/documents/toxcast_assay_annotation_data_users_guide_20141021.pdf



www.openrisknet.org

www.openrisknet.org

Conclusion

Goal 1: Identification of OpenRiskNet services

Goal 2: Capacities of a service

● High level description● Annotation of return values

Goal 3: Additional informations

www.openrisknet.org

Acknowledgements

OpenRiskNet (Grant Agreement 731075) is a project funded by the European Commission within Horizon 2020 Programme

Project partners:P1 Edelweiss Connect GmbH, Switzerland (EwC)P2 Johannes Gutenberg-Universität Mainz, Germany (JGU)P3 Fundacio Centre De Regulacio Genomica, Spain (CRG)P4 Universiteit Maastricht, Netherlands (UM)P5 The University Of Birmingham, United Kingdom (UoB)P6 National Technical University Of Athens, Greece (NTUA)P7 Fraunhofer Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.,

Germany (Fraunhofer)P8 Uppsala Universitet, Sweden (UU)P9 Medizinische Universität Innsbruck, Austria (MUI)P10 Informatics Matters Limited, United Kingdom (IM)P11 Institut National De L’environnement Et Des Risques INERIS, France (INERIS)P12 Vrije Universiteit Amsterdam, Netherlands (VU)

www.openrisknet.org

Webinars series https://openrisknet.org/events/

Topic Date & Time

Past events

Introduction sessions to the OpenRiskNet e-infrastructure

Webinar recordings:● Session 1 (24 Sep 2018)● Session 2 (27 Sep 2018)● Session 3 (4 Oct 2018)● Session 4 (30 Oct 2018)

Learn how to deploy the OpenRiskNet virtual research environment Webinar recordings (25 Feb 2019)

Demonstration on data curation and creation of pre-reasoned datasets in the OpenRiskNet framework Webinar recordings (18 Mar 2019)

Identification and linking of data related to AOPWiki (an OpenRiskNet case study) Webinar recordings (26 March 2019)

The Adverse Outcome Pathway Database (AOP-DB) Webinar recordings (8 April 2019)

Current event How to describe OpenRiskNet services and their functionality by semantic annotation

Monday, 13 May 2019, 16:00 CESThttps://openrisknet.org/events/64/

Future events

Use of Nextflow tool for toxicogenomics-based prediction and mechanism identification in OpenRiskNet e-infrastructure

Monday, 27 May 2019, 16:00 CESTRegistration: https://openrisknet.org/events/65/

Case study demo: ModelRX Tuesday, 11 June 2019, 16:00 CEST

Application deployment Date to be announced

Additional demo sessions on case studies June-July 2019 (to be announced)












Documents

How to describe OpenRiskNet services and their functionality by … · 2019. 5. 13. · JSON-LD, where the later opens up the world of ontologies, semantic web and Resource Description