Linked Data
“I’m encouraged by what actually can be done to improve the research and commercial utility of information.”
Here’s an intro to why …
Kerstin Forsberg, AstraZeneca R&D, Sweden
Introduction
Linked Data Introduction 2
Web of (Linked) Data
Web 3.0
Web of Documents
An Intro To The Semantic Web: Why You Need To Know About It Sooner Than Later, by Samantha Wong. Image source: Frederic Martin
Linking Open Data (LOD) cloud
The Linking Open Data cloud diagram
http://data.gov.uk/
http://data.gov/
Two Forerunners: UK and US Government
Linking Open Data (LOD) cloud
Global identifier (URI) for an AZ study: http://data.linkedct.org/resource/trial/NCT00755378
48 RDF Triples
Linked Data: ClinicalTrials.gov
4 Principles for Linked Data … and 5 stars for Linked Open Data
1. Use URIs (Uniform Resource Identifiers) as names for things.
2. Use HTTP URIs so that people can look up (dereference) those names.
3. When someone looks up a URI, provide useful information.
4. Include links to other URIs so that they can discover more things.
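Principles 2 and 3 can be illustrated with a small Python sketch that prepares an HTTP lookup of the study URI shown earlier. It assumes the server supports content negotiation and can answer with RDF in Turtle syntax; the slides do not state this. The helper name build_lookup_request is hypothetical, and the request is only built here, not sent.

```python
from urllib.request import Request

# URI taken from the slides (LinkedCT entry for an AZ study).
STUDY_URI = "http://data.linkedct.org/resource/trial/NCT00755378"


def build_lookup_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare (but do not send) an HTTP GET that asks for RDF about `uri`.

    Assumption: the server honours the Accept header (content negotiation).
    """
    return Request(uri, headers={"Accept": media_type}, method="GET")


request = build_lookup_request(STUDY_URI)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the prepared request to urllib.request.urlopen would perform the actual lookup (dereferencing), provided the service is reachable.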
Source: Linked Open Data star scheme by example
More resources introducing and describing the Linked Data idea
Linked Enterprise Data
Source: What does Open Data mean for Enterprises?
More resources introducing and describing the Linked Data idea
“Linked R&D Data”: Examples of Building Blocks
From the AZ RDI report “Persistent URIs and Linked Data”, ordered by Mike Westaway:
• Global identifier (URI) scheme for AZ entities (e.g. people, projects, studies, drugs)
• Recommended biomedical ontologies and basic vocabularies to provide context (semantic and provenance) to data
• Dataset Catalogue
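As an illustration of the first building block, a global identifier scheme could be sketched as a small function that mints URIs from an entity type and a local identifier. The base http://example.org/id/ and the function name mint_uri are hypothetical; the slides do not define the actual AZ scheme.

```python
from urllib.parse import quote

# Hypothetical base; the slides do not define the actual AZ URI scheme.
BASE = "http://example.org/id/"

# Entity types named on the slide.
ENTITY_TYPES = {"person", "project", "study", "drug"}


def mint_uri(entity_type: str, local_id: str) -> str:
    """Build a URI of the form <BASE><entity_type>/<local_id>."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {entity_type!r}")
    # Percent-encode the local identifier so it is safe inside a URI path.
    return f"{BASE}{entity_type}/{quote(local_id, safe='')}"


print(mint_uri("study", "D8180C00011"))
```

Keeping the pattern in one place is one way to make identifiers predictable and persistent across systems.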
I’m encouraged by …
• … what actually can be done by applying Linked Data principles, together with a stepwise implementation and pragmatic application of crucial building blocks, to …
• … improve the research and commercial utility of information
  • Organized for associations
  • Prepared for not yet defined use
  • Ready for automation where computers can function alongside us to
    • Mitigate the complexity in discovering, accessing, connecting and interpreting information
    • Improve the productivity in managing information
Health Care and Life Sciences (HCLS) Interest Group: Linking Open Drug Data
EU project The Large Knowledge Collider: Linked Life Data
A 2-page summary of our learnings from participating in these external projects: Linked Data in Pharma, 2011, Bo Andersson and Kerstin Forsberg
Extras
• One example
  • Spending Data in UK
• Two things to remember
  • RDF Triples
  • Global Identifiers (URIs)
• Scenarios
  • Linked Clinical Study Metadata
  • Linked Patient Data in a Clinical Study
Kerstin Forsberg CDISC Interchange Europe 2011 eHR and the World Beyond
One example
Linked Spending Data: How and Why Bother
http://spending.lichfielddc.gov.uk/spend/8605670
Globally identified by a URI
From the Linking Open Data cloud
A Payment from Lichfield District Council, one local authority in the UK.
Spending Data in UK
Two things to remember: RDF Triples
RDF = Resource Description Framework
subject | predicate | object

Example from a text book
The sky | has the color | blue

Example from the Spending data example in UK
Payment number 8605670 | Net Amount | 120.00
Payment number 8605670 | Type | Expenditure Line
Payment number 8605670 | Payer | Lichfield District Council
Lichfield District Council | Type | Local Authority

Triples for the standards that provide the semantics
The property Net Amount | comment | “The net amount of the payment. This is the effective cost to the payer after any reclaimable tax has been deducted.”
The class Expenditure Line | subclass of | Observation in a multi-dimensional data cube for statistics
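The spending example can be written down as plain subject, predicate, object tuples. The sketch below is only an illustration in Python with human-readable labels instead of URIs; the names Triple and match are hypothetical helpers, not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


# The four statements from the UK spending example, with plain labels.
TRIPLES = [
    Triple("Payment number 8605670", "Net Amount", "120.00"),
    Triple("Payment number 8605670", "Type", "Expenditure Line"),
    Triple("Payment number 8605670", "Payer", "Lichfield District Council"),
    Triple("Lichfield District Council", "Type", "Local Authority"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the given parts; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]


# Follow a link: who paid, and what type of thing is the payer?
payer = match(TRIPLES, "Payment number 8605670", "Payer")[0].object
print(payer, "is a", match(TRIPLES, payer, "Type")[0].object)
```

Because the object of one triple can be the subject of another, the statements form a graph that can be followed from the payment to the payer and on to the payer's type.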
Two things to remember: Global Identifiers
Examples of Identifiers for “types of things” / “standards for things”
http://reference.data.gov.uk/def/payment#netAmount
http://reference.data.gov.uk/def/payment#ExpenditureLine
http://purl.org/linked-data/cube#Observation
http://statistics.data.gov.uk/def/administrative-geography/LocalAuthority
URI = Uniform Resource Identifier
Examples of Identifiers for “things” in Spending data example
http://spending.lichfielddc.gov.uk/spend/8605670
http://statistics.data.gov.uk/id/local-authority/41UD
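With global identifiers in place, the same statements can be serialized in the N-Triples syntax. The sketch below uses only the URIs listed on this slide plus the standard rdf:type property. Two assumptions are made: that the local-authority URI identifies Lichfield District Council, as the slide implies, and that the net amount is a plain literal (the slides do not give its datatype). The helper names are hypothetical.

```python
# Standard RDF property for "Type".
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Identifiers copied from the slide.
PAYMENT = "http://spending.lichfielddc.gov.uk/spend/8605670"
COUNCIL = "http://statistics.data.gov.uk/id/local-authority/41UD"
NET_AMOUNT = "http://reference.data.gov.uk/def/payment#netAmount"
EXPENDITURE_LINE = "http://reference.data.gov.uk/def/payment#ExpenditureLine"
LOCAL_AUTHORITY = (
    "http://statistics.data.gov.uk/def/administrative-geography/LocalAuthority"
)


def iri(value: str) -> str:
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


TRIPLES = [
    (iri(PAYMENT), iri(NET_AMOUNT), literal("120.00")),  # datatype assumed plain
    (iri(PAYMENT), iri(RDF_TYPE), iri(EXPENDITURE_LINE)),
    (iri(COUNCIL), iri(RDF_TYPE), iri(LOCAL_AUTHORITY)),
]


def to_ntriples(triples) -> str:
    """Serialize one triple per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


print(to_ntriples(TRIPLES))
```

The payer statement is left out on purpose, because the slide does not list a URI for that property.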
One example – “under the hood”
Live view using the Web Data Inspector
2 of 10 RDF Triples
One example – with a top-down approach to standardization of the semantics
Live view using the Web Data Inspector
The Payment Ontology
UK government: Top-down approach to standardization for Spending Data
Statistical Data perspective: Linked Data Cube Vocabulary
Payment Ontology
Guide to the Payments Ontology
The RDF Data Cube vocabulary
Presentation: Statistical Data in RDF
Scenario: Linked Clinical Study Metadata
http://clinial.data.astrazeneca.com/id/study/D8180C00011
What would we like to see as the linked data description of it?
What would we like to see on an internal webpage presenting linked data describing a clinical study?
owl:sameAs http://data.linkedct.org/resource/trial/NCT00755378
http://reference.cdisc.org/ct/sdtm/TSPARAMCD#ROUTE
Internal categorization to support Design & Interpretation decisions
http://clinical.reference.astrazenenca.com/DI/SIZE#LARGE
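The owl:sameAs link in this scenario lets internal and external descriptions of the same study be read together. The Python sketch below shows one way that could work. The study and category URIs are copied verbatim from the slides; the two http://example.org/def/ properties and the helper names aliases and describe are hypothetical.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# URIs copied verbatim from the slides.
INTERNAL = "http://clinial.data.astrazeneca.com/id/study/D8180C00011"
EXTERNAL = "http://data.linkedct.org/resource/trial/NCT00755378"
SIZE_LARGE = "http://clinical.reference.astrazenenca.com/DI/SIZE#LARGE"

TRIPLES = [
    (INTERNAL, OWL_SAME_AS, EXTERNAL),
    # Hypothetical properties, for illustration only.
    (INTERNAL, "http://example.org/def/sizeCategory", SIZE_LARGE),
    (EXTERNAL, "http://example.org/def/registryId", "NCT00755378"),
]


def aliases(uri, triples):
    """Collect every URI connected to `uri` by owl:sameAs, in either direction."""
    found = {uri}
    frontier = [uri]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p != OWL_SAME_AS:
                continue
            for a, b in ((s, o), (o, s)):
                if a == current and b not in found:
                    found.add(b)
                    frontier.append(b)
    return found


def describe(uri, triples):
    """Return (predicate, object) pairs stated about `uri` or any of its aliases."""
    names = aliases(uri, triples)
    return sorted(
        (p, o) for s, p, o in triples if s in names and p != OWL_SAME_AS
    )


for predicate, obj in describe(INTERNAL, TRIPLES):
    print(predicate, "->", obj)
```

Asking about either URI returns the same merged description, which is the practical effect of declaring the two identifiers equivalent.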
Scenario
• If each AZ clinical study had a global identifier/URI (could be something like http://data.astrazenenca.com/id/clinicalstudy/D8180C00011/), similar to what already exists today for studies in ClinicalTrials.gov, e.g. http://data.linkedct.org/resource/trial/NCT00755378
• If each identified Observation in a clinical study dataset delivered to AZ had a global identifier, e.g. http://data.astrazenenca.com/data/observation/D8180C00011/20000034
• If each individual Observation were linked via an RDF triple to its globally identified test procedure, similar to what exists today for CDISC’s SDTM submission values, e.g. hemoglobin measurement http://linkedlifedata.com/resource/umls/id/C051801
• If each individual Observation were linked via an RDF triple to contextual information, similar to what exists today for CDISC’s SDTM submission values, e.g. http://linkedlifedata.com/resource/umls/id/C0038846 (prefLabel ‘Supine Position’). (Hopefully, CDISC, together with NCI, will in the future publish their standards in a similar way to make it easier to link clinical data.)
• If each identified Observation were also delivered with its provenance information, for example two RDF triples expressing references to the identified measurement device and to the SOP document being used. For more details, see the provenance vocabulary http://trdf.sourceforge.net/provenance/ns.htm
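Taken together, the conditions above amount to a handful of triples per Observation. The following Python sketch is one possible reading. The observation, study, test procedure and context URIs are copied verbatim from the scenario; every property under http://example.org/def/ and the device and SOP URIs are hypothetical placeholders, since the slides do not name them (a real implementation might take the provenance terms from the vocabulary referenced above).

```python
EX = "http://example.org/def/"  # hypothetical vocabulary namespace

# URIs copied verbatim from the scenario.
OBSERVATION = "http://data.astrazenenca.com/data/observation/D8180C00011/20000034"
STUDY = "http://data.astrazenenca.com/id/clinicalstudy/D8180C00011/"
TEST_PROCEDURE = "http://linkedlifedata.com/resource/umls/id/C051801"
CONTEXT = "http://linkedlifedata.com/resource/umls/id/C0038846"  # 'Supine Position'

# Hypothetical identifiers for the provenance targets.
DEVICE = "http://example.org/id/device/1"
SOP = "http://example.org/id/sop/1"

PROVENANCE_PROPERTIES = {EX + "measuredWith", EX + "followsSOP"}


def describe_observation(obs, study, test, context, device, sop):
    """Return the triples that link one Observation to its study,
    test procedure, context and provenance."""
    return [
        (obs, EX + "partOfStudy", study),
        (obs, EX + "testProcedure", test),
        (obs, EX + "context", context),
        (obs, EX + "measuredWith", device),  # provenance: measurement device
        (obs, EX + "followsSOP", sop),       # provenance: SOP document
    ]


triples = describe_observation(
    OBSERVATION, STUDY, TEST_PROCEDURE, CONTEXT, DEVICE, SOP
)
provenance = [t for t in triples if t[1] in PROVENANCE_PROPERTIES]

for s, p, o in triples:
    print(f"<{s}> <{p}> <{o}> .")
```

Every value in these triples is itself a URI, so each one can in turn be looked up for more information, as the fourth Linked Data principle suggests.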
Linked Patient Data in a Clinical Study