Semantics and Linked Data at AstraZeneca R&D

Kerstin Forsberg

Semantics@Roche 2016

On my agenda:

1. An internal example: CI360 (Competitive Intelligence)

2. Our public, pre-competitive engagement, examples: Open PHACTS, PhUSE, MedDRA and WHO ATC

3. Our Linked Data Community of Practice (LD CoP)

4. Ongoing work: URI Recommendations

Who am I?

My goal: “Improving the utility of clinical trial data. Making it easier to use data and to be able to take informed decisions when combining data across clinical trials.”

Industry expert in semantic interoperability, clinical data standards and medical terminologies.

My first semantic web article, from 2000: “Extensible use of RDF in a business context”.

1,900+ followers on social media (@kerfors), including FDA’s Chief Health Informatics Officer, AstraZeneca’s CIO and HL7’s CTO.

Contributor to PhUSE, CDISC, W3C HCLS, IMI EHR4CR and MedDRA MSSO, and Advisory Board member for the SALUS (Post Market Safety Studies) EU project.

Informatics Analyst and lifelong learner

Interesting times: Open Data, Open Innovation

See: http://www.bio-itworld.com/2016/6/10/astrazeneca-CI360-wins-bio-itworld-knowledge-management.aspx

1. An internal example: CI360 (Competitive Intelligence)

Competitive Intelligence 360 (CI360) Approach

Flexibly Addressing Key Questions

Capture Business Questions and Sources

Domain Expert Concept Map

Build Formal Ontology

Challenge with Linked Data

Examine with a Faceted Browser

Share Insights with a Knowledge Base

Capture Business Questions

Capture Business Questions and Sources

Translate Questions into Concepts

Domain Expert Concept Map

“Where are the key clinical studies in NSCLC and who are the principal investigators?”

Challenge with Data

“Where are the key clinical studies in NSCLC and who are the principal investigators?” (one example)

Source: https://clinicaltrials.gov/ct2/show/NCT02027428

Challenge with Linked Data
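
A minimal sketch of what challenging a business question against linked data can look like, assuming the data is available as subject-predicate-object triples. Every identifier below (the `ex:` names, predicates, studies and investigators) is a hypothetical placeholder and not taken from CI360 or from any trial record.

```python
# Hypothetical triples: none of these identifiers describe real studies.
TRIPLES = [
    ("ex:study1", "ex:condition", "ex:NSCLC"),
    ("ex:study1", "ex:locatedIn", "ex:Sweden"),
    ("ex:study1", "ex:principalInvestigator", "ex:investigatorA"),
    ("ex:study2", "ex:condition", "ex:NSCLC"),
    ("ex:study2", "ex:locatedIn", "ex:Japan"),
    ("ex:study2", "ex:principalInvestigator", "ex:investigatorB"),
    ("ex:study3", "ex:condition", "ex:BreastCancer"),
    ("ex:study3", "ex:locatedIn", "ex:Sweden"),
    ("ex:study3", "ex:principalInvestigator", "ex:investigatorC"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def studies_with_investigators(triples, condition):
    """For one condition, list each study with its locations and investigators."""
    rows = []
    for study, _, _ in match(triples, p="ex:condition", o=condition):
        places = [o for _, _, o in match(triples, s=study, p="ex:locatedIn")]
        leads = [o for _, _, o in match(triples, s=study, p="ex:principalInvestigator")]
        rows.append((study, places, leads))
    return rows


answer = studies_with_investigators(TRIPLES, "ex:NSCLC")
for study, places, leads in answer:
    print(study, places, leads)
```

In practice the same pattern would more likely be written as a SPARQL query against a triple store; the plain-Python version only illustrates how one question decomposes into linked triple patterns.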

Refine the Answer

Examine with a Faceted Browser

“What are the open trials in metastatic breast cancer and what drugs are being tested?”
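
As a rough illustration of how a faceted browser narrows such a question, the sketch below filters records by selected facet values and then counts the values of a remaining facet. The records, facet names and drug labels are hypothetical placeholders; the actual CI360 browser and its data model are not described on the slides.

```python
from collections import Counter

# Hypothetical trial records; all values are placeholders.
TRIALS = [
    {"id": "T1", "status": "open", "condition": "metastatic breast cancer", "drug": "drug-A"},
    {"id": "T2", "status": "open", "condition": "metastatic breast cancer", "drug": "drug-B"},
    {"id": "T3", "status": "closed", "condition": "metastatic breast cancer", "drug": "drug-A"},
    {"id": "T4", "status": "open", "condition": "NSCLC", "drug": "drug-C"},
]


def apply_facets(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in selected.items())]


def facet_counts(records, facet):
    """Count how often each value of one facet occurs in the current result set."""
    return Counter(r[facet] for r in records)


hits = apply_facets(TRIALS, status="open", condition="metastatic breast cancer")
drug_counts = facet_counts(hits, "drug")
print([r["id"] for r in hits], dict(drug_counts))
```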

Share Insights as a Community

“Can a biomarker-defined population be added to a trial record?”

Share insights with a Knowledge Base

2. Our Public, pre-competitive engagement, examples:

Open PHACTS, CDISC/PhUSE, MedDRA and WHO ATC

The Open PHACTS Discovery Platform

The Open PHACTS Foundation and Uptake at AstraZeneca

CDISC and PhUSE Semantic Technology

• CDISC2RDF, Oct 2012: a pre-competitive project with AZ, Roche, W3C et al. to showcase Semantic Web standards and Linked Data principles.

• FDA meeting, Nov 2012: Solutions for Study Data Exchange Standards Meeting, with a W3C presentation.

• June 2013: the Semantic Technology project, an FDA/PhUSE working group for Emerging Technologies, with 25+ representatives from FDA, CDISC, pharma companies, CROs and software vendors.

• Oct 2013, press release: Representing existing standards (SDTM, CDASH, SEND, ADaM) in RDF.

• Dec 2014: public review of the CDISC in RDF Guide.

• July 2015: published on http://www.cdisc.org/rdf and https://github.com/phuse-org/rdf.cdisc.org

CDISC Interchange Europe 2011 and 2012 presentations from Roche and AstraZeneca

3. Our Linked Data Community of Practice (LD CoP)

6-7 seminars per year

SharePoint site and Chatter group

Mailing list with 50+ colleagues across IT and business

Friends and thought leaders in semantic web and linked data

LD CoP, 16 Sept: presentation by Armando Oliva, previously at FDA

4. Ongoing work

Identifying Studies via URIs and programmatic access/APIs

Global, persistent, resolvable identifier (AZT_ID) as URIs

http://clinicaltrials.rd.astrazeneca.net/study/D5896C00725

PREFIX azt: <http://clinicaltrials.rd.astrazeneca.net/study/>

azt:D5896C00725
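
The relation between the prefixed name and the full URI can be sketched as a simple string expansion. The namespace and study identifier come from the slide; the function names `expand` and `compact` are illustrative only.

```python
# Prefix table; the azt namespace is the one shown on the slide.
PREFIXES = {"azt": "http://clinicaltrials.rd.astrazeneca.net/study/"}


def expand(curie, prefixes=PREFIXES):
    """Expand a prefixed name such as 'azt:D5896C00725' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


def compact(uri, prefixes=PREFIXES):
    """Shorten a full URI to a prefixed name when a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no known namespace: leave the URI unchanged


full_uri = expand("azt:D5896C00725")
print(full_uri)
```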

MLCSMS_STUDY_ID

IMPACTTRIAL_NO

CT.GOVNCT_NUMBER

Data integration/virtualisation

Look-up/master table connecting AZT_IDs to database record identifiers

Resolved via a Look-up Study API

http://clinicaltrials.rd.astrazeneca.net/api/v1/study?azt_id=D5896C00725

Alternative searches for the same study, e.g.

http://clinicaltrials.rd.astrazeneca.net/api/v1/study?studyname=SD-039-0725

http://clinicaltrials.rd.astrazeneca.net/api/v1/study?studyname=SPROUT

http://clinicaltrials.rd.astrazeneca.net/api/v1/study?studyname=NCT00646321
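
A minimal sketch of the look-up idea, assuming the master table can be modelled as a mapping from alternative study names to one AZT_ID. The endpoint, parameter names and identifiers are those shown above; the table layout and the helper names `lookup_url` and `resolve` are assumptions, and no request is sent because the host is internal and the response format is not described.

```python
from urllib.parse import urlencode

API_BASE = "http://clinicaltrials.rd.astrazeneca.net/api/v1/study"  # from the slide
STUDY_NS = "http://clinicaltrials.rd.astrazeneca.net/study/"        # from the slide

# Assumed shape of the look-up table: alternative name -> AZT_ID.
ALIASES = {
    "SD-039-0725": "D5896C00725",
    "SPROUT": "D5896C00725",
    "NCT00646321": "D5896C00725",
}


def lookup_url(**params):
    """Build a Look-up Study API URL, e.g. lookup_url(studyname='SPROUT')."""
    return f"{API_BASE}?{urlencode(params)}"


def resolve(name):
    """Map an AZT_ID or a known alternative name to the study URI, else None."""
    azt_id = name if name in ALIASES.values() else ALIASES.get(name)
    return None if azt_id is None else STUDY_NS + azt_id


print(lookup_url(azt_id="D5896C00725"))
print(resolve("SPROUT"))
```

Whatever name a caller starts from, the result is the same persistent study URI, which is what lets data about one study be joined across databases.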

Surface data about a study across databases

URIs = Uniform Resource Identifiers

Key to link data about resources/entities in a robust way

Check out the Linked Data Principles

URI Recommendations and VoID dataset descriptions
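
As a hedged illustration of what a VoID dataset description could contain, the sketch below renders a small Turtle document using terms from the VoID and Dublin Core vocabularies (`void:Dataset`, `void:uriSpace`, `void:exampleResource`, `dcterms:title`). The dataset URI and title are hypothetical; only the URI space and the example study URI come from the slides, and the slides do not show AstraZeneca's actual descriptions.

```python
def void_description(dataset_uri, title, uri_space, example_resource):
    """Render a minimal VoID description as Turtle (no escaping of the title)."""
    return "\n".join([
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',
        f'    void:uriSpace "{uri_space}" ;',
        f"    void:exampleResource <{example_resource}> .",
    ])


turtle = void_description(
    dataset_uri="http://example.org/void/studies",  # hypothetical dataset URI
    title="Clinical studies",                       # hypothetical title
    uri_space="http://clinicaltrials.rd.astrazeneca.net/study/",
    example_resource="http://clinicaltrials.rd.astrazeneca.net/study/D5896C00725",
)
print(turtle)
```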

AZ/MedImmune

Linked Data Community

Tom Plasterer

Ola Engqvist

Rajan Desai

Jeff Saltzman

David Ruau

Kathy Reinold

Johan Almström

Thanks

Key Influencers

David Wood

Lee Harland

Bryn Williams-Jones

Eric Neumann

Dean Allemang

Barend Mons

Carole Goble

Bernadette Hyland

Bob Stanley

Michel Dumontier

John Wilbanks

Linked Data in One slide
