
Lankade data Vinnova webbinarium


DESCRIPTION

Slides to be presented at a webinar arranged by Metasolution as part of a Vinnova project http://metasolutions.se/2014/03/webbinarium-med-kerstin-forsberg-om-lankade-data-i-lakemedelsforskningen/


Page 1: Lankade data Vinnova webbinarium

The use of Linked Data principles and Semantic Web standards in pharmaceutical research

Kerstin Forsberg (@kerfors on Twitter, SlideShare etc.)
AZ IT | R&D Information

Competence building around linked open data: dialogue, webinars and white paper

Webinar arranged 18 March 2014

(Apologies for the mix of English and Swedish)

Page 2: Lankade data Vinnova webbinarium


About AstraZeneca

• Alongside our own R&D, we partner with others, combining skills and resources to broaden the potential for successful innovation.

• We believe that only by working together with others who have a part to play in improving healthcare can real progress be made.

• We work closely with others in the healthcare community, including physicians and those who pay for healthcare, to understand their challenges and how we can combine skills and resources to achieve a common goal: improved health.

Kerstin Forsberg | Vinnova webinar, 18 March 2014

Page 3: Lankade data Vinnova webbinarium

Linked Data in Pharmaceutical Research

Two examples of how AstraZeneca works with European research collaborations and international standards organisations to make chemistry data and clinical study data easier to use with the help of next-generation web technology.


Page 4: Lankade data Vinnova webbinarium

The Web turned 25 on 12 March


Web of (Linked) Data

Web of Documents

Source: "An Intro To The Semantic Web: Why You Need To Know About It Sooner Than Later", by Samantha Wong. Image source: Frederic Martin.

Page 5: Lankade data Vinnova webbinarium

RDF (the foundation of the Semantic Web) turned 15 on 22 February


Web of (Linked) Data

Web of Documents

Resource Description Framework

Common Model (“Triples”): subject, predicate, object
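The triple model named on this slide can be illustrated with a minimal sketch in plain Python. It is not an RDF library: the `Triple` type, the `match` helper and every `ex:` identifier are assumptions made up for illustration. Each statement is a (subject, predicate, object) tuple, and a query is a pattern in which `None` acts as a wildcard.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement in the common model: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical example data; the ex: identifiers are invented for this sketch.
GRAPH = [
    Triple("ex:aspirin", "rdf:type", "ex:Compound"),
    Triple("ex:aspirin", "ex:inhibits", "ex:COX1"),
    Triple("ex:COX1", "rdf:type", "ex:Target"),
]


def match(graph: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None matches anything."""
    return [
        t for t in graph
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    ]


if __name__ == "__main__":
    # Everything the graph states about one subject.
    for t in match(GRAPH, s="ex:aspirin"):
        print(t.subject, t.predicate, t.obj)
```

Because every source is reduced to the same three-part shape, data from different origins can be placed in one list and queried with the same pattern function.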

Page 6: Lankade data Vinnova webbinarium

The Project

The Innovative Medicines Initiative
• EC funded public-private partnership for pharmaceutical research
• Focus on key problems
  – Efficacy, Safety, Education & Training, Knowledge Management

The Open PHACTS Project
• Create a semantic integration hub (“Open Pharmacological Space”)…
• Delivering services to support on-going drug discovery programs in pharma and public domain
• Not just another project; leading academics in semantics, pharmacology and informatics, driven by solid industry business requirements
• 23 academic partners, 8 pharmaceutical companies, 3 biotechs
• Work split into clusters:
  – Technical Build
  – Scientific Drive
  – Community & Sustainability

Page 7: Lankade data Vinnova webbinarium

Pre-competitive Informatics: Pharma are all accessing, processing, storing & re-processing external research data

[Diagram labels: Literature, PubChem, Genbank, Patents, Databases, Downloads; Data Integration, Data Analysis, Firewalled Databases; repeat @ each company]

Lowering industry firewalls: pre-competitive informatics in drug discovery Nature Reviews Drug Discovery (2009) 8, 701-708 doi:10.1038/nrd2944

Page 8: Lankade data Vinnova webbinarium

Data sources: ChEMBL, DrugBank, Gene Ontology, Wikipathways, UniProt, ChemSpider, UMLS, ConceptWiki, ChEBI, TrialTrove, GVKBio, GeneGo, TR Integrity

“Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM”

“What is the selectivity profile of known p38 inhibitors?”

“Let me compare MW, logP and PSA for known oxidoreductase inhibitors”
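The first question above can be read as a filter over integrated records, and the sketch below shows that reading. All data and names here (`Activity`, `PATHWAY_TARGETS`, the compound and target codes) are hypothetical and do not come from any of the listed sources. The sketch also assumes that "assayed in only functional assays" means restricting to records whose assay type is functional, and that 1 μM equals 1000 nM.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Activity:
    compound: str
    target: str
    assay_type: str    # e.g. "functional" or "binding"
    potency_nm: float  # potency in nanomolar


# Hypothetical pathway membership and activity records (not real data).
PATHWAY_TARGETS = {"NFkB": {"T1", "T2"}}
ACTIVITIES = [
    Activity("C1", "T1", "functional", 250.0),
    Activity("C2", "T1", "binding", 50.0),       # excluded: not a functional assay
    Activity("C3", "T2", "functional", 5000.0),  # excluded: potency above 1 uM
    Activity("C4", "T9", "functional", 10.0),    # excluded: target not in pathway
]


def compounds_for_pathway(pathway: str,
                          activities: Iterable[Activity],
                          max_potency_nm: float = 1000.0) -> List[str]:
    """Compounds active below the potency cutoff, in functional assays,
    against any target that belongs to the given pathway."""
    targets = PATHWAY_TARGETS.get(pathway, set())
    return sorted({
        a.compound for a in activities
        if a.target in targets
        and a.assay_type == "functional"
        and a.potency_nm < max_potency_nm
    })


if __name__ == "__main__":
    print(compounds_for_pathway("NFkB", ACTIVITIES))
```

The point of the sketch is that the question spans at least two kinds of source (pathway membership and assay results), which is why the sources need shared identifiers before such a filter can run.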

Page 9: Lankade data Vinnova webbinarium

Number | sum | Nr of 1 | Question
15 | 12 | 9 | All oxidoreductase inhibitors active <100nM in both human and mouse
18 | 14 | 8 | Given compound X, what is its predicted secondary pharmacology? What are the on- and off-target safety concerns for a compound? What is the evidence and how reliable is that evidence (journal impact factor, KOL) for findings associated with a compound?
24 | 13 | 8 | Given a target find me all actives against that target. Find/predict polypharmacology of actives. Determine ADMET profile of actives.
32 | 13 | 8 | For a given interaction profile, give me compounds similar to it.
37 | 13 | 8 | The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data in serine protease assays for molecules that contain substructure X.
38 | 13 | 8 | Retrieve all experimental and clinical data for a given list of compounds defined by their chemical structure (with options to match stereochemistry or not).
41 | 13 | 8 | A project is considering Protein Kinase C Alpha (PRKCA) as a target. What are all the compounds known to modulate the target directly? What are the compounds that may modulate the target directly? I.e. return all compounds active in assays where the resolution is at least at the level of the target family (i.e. PKC), both from structured assay databases and the literature.
44 | 13 | 8 | Give me all active compounds on a given target with the relevant assay data
46 | 13 | 8 | Give me the compound(s) which hit most specifically the multiple targets in a given pathway (disease)
59 | 14 | 8 | Identify all known protein-protein interaction inhibitors

Business Question Driven Approach

http://www.sciencedirect.com/science/article/pii/S1359644613001542

Page 10: Lankade data Vinnova webbinarium

[Architecture diagram; recoverable labels follow]

Data sources: Public Content, Commercial, Public Ontologies, User Annotations
Source labels: Db, RDF, Nanopub, VoID

Core Platform: Data Cache (Virtuoso Triple Store), Semantic Workflow Engine, Identity Resolution Service, Identifier Management Service, Chemistry Registration Normalisation & Q/C, Indexing

Example identifiers: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”

Linked Data API (RDF/XML, TTL, JSON), Domain Specific Services

Apps
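One way to picture what an identifier management component in such a platform has to do is to group identifiers from different sources that have been declared equivalent. The sketch below uses a union-find structure for that. It is an assumption about the general technique, not a description of how the Open PHACTS service is implemented, and the class name and the `dbA:`, `dbB:`, `dbC:` identifiers are hypothetical.

```python
from typing import Dict, List


class IdentifierMapper:
    """Groups identifiers that have been declared equivalent (union-find)."""

    def __init__(self) -> None:
        self._parent: Dict[str, str] = {}

    def _find(self, x: str) -> str:
        # Register unseen identifiers as their own group.
        self._parent.setdefault(x, x)
        while self._parent[x] != x:
            # Path halving keeps later lookups short.
            self._parent[x] = self._parent[self._parent[x]]
            x = self._parent[x]
        return x

    def link(self, a: str, b: str) -> None:
        """Declare that identifiers a and b refer to the same thing."""
        ra, rb = self._find(a), self._find(b)
        if ra != rb:
            self._parent[rb] = ra

    def equivalents(self, x: str) -> List[str]:
        """All identifiers in the same group as x, including x itself."""
        root = self._find(x)
        return sorted(i for i in list(self._parent) if self._find(i) == root)


if __name__ == "__main__":
    mapper = IdentifierMapper()
    # Hypothetical equivalence statements between three sources.
    mapper.link("dbA:123", "dbB:abc")
    mapper.link("dbB:abc", "dbC:xyz")
    print(mapper.equivalents("dbA:123"))
```

Equivalence is treated as transitive here, which is a simplification: a real service may need to scope mappings by context, since two records that are "the same" for one question can differ for another.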

Page 11: Lankade data Vinnova webbinarium

Platform Explorer

Standards

Apps

API

“Provenance Everywhere”

Page 12: Lankade data Vinnova webbinarium

Clinical data standards, today’s documentation


Human-readable documentation in PDFs of 200+ pages and Excel files (and some in XML).

Page 13: Lankade data Vinnova webbinarium

Clinical data standards in the Semantic Web
Enable end-to-end interoperable data standards for clinical research


Page 14: Lankade data Vinnova webbinarium

Clinical data standards in the Semantic Web
Example: 14 RDF triples describing one variable (“AEACN”)


RDF triples describing one variable/data element and also linked to related standard parts
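The idea of describing one variable with triples and linking it to related parts of the standard can be sketched as follows. The slide's 14 triples are not reproduced in the transcript, so the five triples below are illustrative only: the `ex:` prefix and the property names `ex:inDomain` and `ex:usesCodelist` are assumptions, not the identifiers used by the CDISC2RDF project. AEACN is the SDTM adverse event variable "Action Taken with Study Treatment".

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical prefixes and property names, chosen for illustration only.
TRIPLES: List[Triple] = [
    ("ex:AEACN", "rdf:type", "ex:Variable"),
    ("ex:AEACN", "rdfs:label", "Action Taken with Study Treatment"),
    ("ex:AEACN", "ex:inDomain", "ex:AE"),
    ("ex:AEACN", "ex:usesCodelist", "ex:ActionTakenCodelist"),
    ("ex:AE", "rdfs:label", "Adverse Events"),
]


def describe(subject: str, triples: List[Triple]) -> Dict[str, List[str]]:
    """Collect every predicate and its objects for one subject."""
    desc: Dict[str, List[str]] = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            desc[p].append(o)
    return dict(desc)


def follow(subject: str, predicate: str,
           triples: List[Triple]) -> Dict[str, Dict[str, List[str]]]:
    """Follow one link from the subject and describe each linked resource."""
    return {
        o: describe(o, triples)
        for s, p, o in triples
        if s == subject and p == predicate
    }


if __name__ == "__main__":
    print(describe("ex:AEACN", TRIPLES))
    print(follow("ex:AEACN", "ex:inDomain", TRIPLES))
```

Unlike a PDF or spreadsheet row, the link from the variable to its domain is itself data, so a program can follow it without parsing free text.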

Page 15: Lankade data Vinnova webbinarium

Clinical standards in the Semantic Web
Community building and knowledge sharing

• CDISC2RDF started Oct 2012 as a pre-competitive project with AZ, Roche, W3C et al. to showcase Semantic Web standards and Linked Data principles.

• FDA meeting Nov 2012: Solutions for Study Data Exchange Standards Meeting – W3C Semantic Web presentation

• June 2013: the Semantic Technology project, an FDA/PhUSE working group for Emerging Technologies, with 25+ representatives from FDA, CDISC, pharma companies, CROs and software vendors.

• Oct 2013 press release: Representing existing standards (SDTM, CDASH, SEND, ADaM) in RDF.

CDISC Interchange Europe 2011 and 2012: presentations from Roche and AstraZeneca

Page 16: Lankade data Vinnova webbinarium

AstraZeneca’s view on “Semantics”

Enabling the hyperconnected enterprise


“We need to build a linked data architecture enabling us to ask questions and solve business problems across a heterogeneous information landscape extending beyond the traditional boundaries of the enterprise.”

semantics connects us all

Page 17: Lankade data Vinnova webbinarium


Acknowledgements

AZ’s Linked Data of Practice members: Tom Plasterer (lead), Jim Morris, Courtland Yockey, Sorana Popa, Rob Hernandez, Mike Westaway, Rajan Desai, Simon Rakov, Dana Crowley, Ian Dix, Johan Törnqvist

Collaborators and Advisors:
• Charlie Mead – IO Informatics
• Dean Allemang – Working Ontologist
• Frederik Malfait – IMOS consulting / Roche
• Phil Ashworth – TopQuadrant


Thank you!