Introduction to linked data
Gordon Dunsire
Presented at the Cataloguing and Indexing Group Scotland seminar “Linked data and the Semantic Web: what have libraries got to do with it?”, Edinburgh, National Library of Scotland, 17 June 2011
Overview
Relational records
Influence of RDA vocabularies
Disaggregated, distributed “records”
Logical conclusion: simple metadata statement
RDF
Triples, etc.
Linked data
Chains, clusters
Bibliographic record: 12345
Title: Cataloguing is fun!
Author: Mary MacDonald
Content type: text
Carrier type: microfiche
LCSH: Cataloging
Name authority record: 8765
Heading: MacDonald, Mary
Place of birth: Edinburgh
LCSH authority record: 5432
Heading: Cataloging
See also: Books
RDA content type record: 1234
Term: text
Definition: Content expressed through a form of notation for language intended to be perceived visually.
RDA carrier type record: 5432
Term: microfiche
Definition: A sheet of film bearing a number of microimages in a two-dimensional array.
Bibliographic record: 12345
Title: Cataloguing is fun!
Author: 8765
Content type: 1234
Carrier type: 5432
LCSH: 5432
Name authority record: 8765
Heading: MacDonald, Mary
Place of birth: 9876
12345 Author 8765
8765 Place of birth 9876
8765 Heading “MacDonald, Mary”
9876 Name “Edinburgh”
9876 Country 4567
Stop! Ambiguous: link not safe.
Identifier: ok to link.
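The statements above can be sketched in code to show how links form a chain. This is a minimal illustration, not part of the original slides: integers stand in for record identifiers (safe to link), plain strings stand in for human labels, and the helper names `lookup` and `follow` are hypothetical.

```python
# Each statement is a (subject, predicate, object) tuple.
# Integers are record identifiers; strings are literal labels.
triples = [
    (12345, "Author", 8765),
    (8765, "Place of birth", 9876),
    (8765, "Heading", "MacDonald, Mary"),
    (9876, "Name", "Edinburgh"),
    (9876, "Country", 4567),
]


def lookup(subject, predicate):
    """Return the object of the first matching statement, or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


def follow(start, *predicates):
    """Walk a chain of predicates from a starting identifier."""
    node = start
    for predicate in predicates:
        node = lookup(node, predicate)
        if node is None:
            return None
    return node


# Bibliographic record -> author -> place of birth -> name of the place.
print(follow(12345, "Author", "Place of birth", "Name"))
```

The chain only works because each intermediate value is an identifier; a label such as “Edinburgh” ends the chain.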
Linked data is not a new idea!
It extends concepts of authority control
“Preferred” labels
Change once; link many times
Re-use of metadata
More than one “attribute” associated with a “heading”
E.g. Place of birth of person with name heading
Concepts can be applied to authority records
As well as bibliographic description records
Full extension leads to “record” dis-aggregation
All “records” in bibliographic control systems
Linked data and RDF
Resource Description Framework (RDF)
Designed for machine-processing of metadata at global scale
24/7/365
Trillions of operations per second
Everything must be dis-ambiguated
Machines are dumb
Simplicity helps!
Machine-readable identifiers
RDF triple
Metadata expressed as “atomic” statements
A simple, single, irreducible statement
The title of this book is “Cataloguing is fun!”
Constructed in 3 parts
“Triple”
The title of this book is “Cataloguing is fun!”
Subject of the statement = Subject: This book
Nature of the statement = Predicate: has title
Value of the statement = Object: “Cataloguing is fun!”
This book – has title – “Cataloguing is fun!”
subject – predicate – object
Identifiers
Need unambiguous way of identifying each part of the triple for efficient machine-processing
Human labels (“This book”, “has title”) no good
Same thing, different labels; different things, same label
Exploit the utility of the URL
Machine-readable, regular syntax, unambiguous
Uniform Resource Identifier (URI)
Uniform Resource Identifier
Can be any unique combination of numbers and letters
No intrinsic meaning; it’s just an identifying label
Can look like a URL
http://iflastandards.info/ns/isbd/elements/P1001
But does not lead to a Web page (in principle ...)
RDF requires the subject and predicate of a triple to be URIs
Object can be a URI, or a literal string (“Cataloguing is fun!”)
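The rule for the three parts of a triple can be sketched as a small check. This is an illustration under stated assumptions: the `example.org` URIs, the `Literal` and `Triple` types, and the helper names are all hypothetical, and the URI test is deliberately rough (it only looks for a scheme such as `http:`).

```python
from typing import NamedTuple, Union
from urllib.parse import urlparse


class Literal(NamedTuple):
    """A plain character string, e.g. a title."""
    value: str


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: Union[str, Literal]


def looks_like_uri(value) -> bool:
    """Rough check only: a string that starts with a scheme."""
    return isinstance(value, str) and bool(urlparse(value).scheme)


def is_valid(triple: Triple) -> bool:
    """Subject and predicate must be URIs; object may be a URI or a literal."""
    return (
        looks_like_uri(triple.subject)
        and looks_like_uri(triple.predicate)
        and (isinstance(triple.object, Literal) or looks_like_uri(triple.object))
    )


# Hypothetical identifiers, for illustration only.
book = "http://example.org/book/12345"
has_title = "http://example.org/terms/hasTitle"

good = Triple(book, has_title, Literal("Cataloguing is fun!"))
bad = Triple("This book", "has title", Literal("Cataloguing is fun!"))

print(is_valid(good), is_valid(bad))
```

Human labels such as “This book” fail the check because they carry no identifier a machine can rely on.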
Namespaces
URI can be constructed from a base plus a unique, identifying suffix
http://iflastandards.info/ns/isbd/elements/ + P1001
Base is known as a namespace
Can be abbreviated by human programmer
“isbd” = http://iflastandards.info/ns/isbd/elements/
isbd:P1001
Machine expands abbreviation for processing
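The expansion step can be sketched as a simple lookup. The namespace and abbreviation are those given above; the `PREFIXES` table and the function names `expand` and `abbreviate` are hypothetical, and real RDF toolkits handle more cases than this sketch does.

```python
# Abbreviation -> namespace base, as given on the slide.
PREFIXES = {"isbd": "http://iflastandards.info/ns/isbd/elements/"}


def expand(abbreviated: str) -> str:
    """Turn 'isbd:P1001' into the full URI; leave anything else unchanged."""
    prefix, sep, suffix = abbreviated.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + suffix
    return abbreviated


def abbreviate(uri: str) -> str:
    """Turn a full URI back into its short form when the base is known."""
    for prefix, base in PREFIXES.items():
        if uri.startswith(base):
            return f"{prefix}:{uri[len(base):]}"
    return uri


print(expand("isbd:P1001"))
print(abbreviate("http://iflastandards.info/ns/isbd/elements/P1001"))
```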
Everything as triples in RDF
Every aspect of the metadata must be expressed in RDF to be machine-processable
Metadata about real-world objects (books, people, etc.)
Metadata about the predicates (definition, label, scope, etc.)
Common predicates apply to many types of thing (human-readable label, etc.)
High-level RDF namespaces (rdfs, owl, skos)
RDF is expressed in RDF (“bootstrap”)
RDF properties
Predicates are called properties in RDF
“Verbal” part of the metadata statement
E.g. “A has title ...”, “B is author of C”, “D is embodiment of E”
Properties link specific instances of two things
A = a specific book, B = a specific person, etc.
... = a specific label, character string, annotation => a “literal”
Properties are the links in linked data, the pathways through the Semantic Web
Domains and ranges
A property can specify the types of thing it links
E.g. Bibliographic resources, Persons, Places, etc.
Types of thing are RDF classes
A domain is the class of the subject of the property
E.g. The domain of “is embodiment of” is Expression (FRBR)
A range is the class of the object of the property
E.g. The range of “is embodiment of” is Manifestation (FRBR)
Inferencing
RDF enables semantic inferencing
Deducing additional, unstated triples from an existing statement or set of statements
E.g. “D is embodiment of E” + “(is embodiment of) has domain Expression” => “D is an Expression”
And “D is embodiment of E” + “(is embodiment of) has range Manifestation” => “E is a Manifestation”
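The two deductions can be sketched as a few lines of code. This is an illustration only: it uses plain labels instead of URIs for readability, takes the domain and range exactly as declared in the example, and the names `DOMAIN`, `RANGE` and `infer_types` are hypothetical.

```python
# Domain and range declarations, as stated in the example.
DOMAIN = {"is embodiment of": "Expression"}
RANGE = {"is embodiment of": "Manifestation"}


def infer_types(statements):
    """Deduce unstated 'is a' triples from domain and range declarations."""
    inferred = set()
    for subject, predicate, obj in statements:
        if predicate in DOMAIN:
            # The subject belongs to the property's domain class.
            inferred.add((subject, "is a", DOMAIN[predicate]))
        if predicate in RANGE:
            # The object belongs to the property's range class.
            inferred.add((obj, "is a", RANGE[predicate]))
    return inferred


stated = [("D", "is embodiment of", "E")]
for triple in sorted(infer_types(stated)):
    print(triple)
```

Neither class membership was stated; both follow from one statement plus the declarations attached to its property.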
The truth
There is no test of veracity for a single triple in RDF
Anybody can say Anything about Anything (AAA)
Inferencing only tests for logical inconsistency
E.g. If it results in “E is a Manifestation” + “E is not a Manifestation”
Library linked data must choose and apply its properties/links with care
To maintain our reputation for reliability, quality, etc.
In a web of user-, machine-, and politically-generated metadata
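The difference between inconsistency and untruth can be sketched as follows. This is a simplified illustration: the predicate “is not a” is a plain-language stand-in (in practice such a clash would typically arise from ontology declarations such as class disjointness), and the function name `contradictions` is hypothetical.

```python
def contradictions(statements):
    """Return (thing, class) pairs that are both asserted and denied."""
    asserted = {(s, o) for s, p, o in statements if p == "is a"}
    denied = {(s, o) for s, p, o in statements if p == "is not a"}
    return sorted(asserted & denied)


inconsistent = [
    ("E", "is a", "Manifestation"),
    ("E", "is not a", "Manifestation"),
]
# A statement may be wrong in the real world and still pass the check:
# nothing here contradicts it.
consistent_but_unverified = [("E", "is a", "Manifestation")]

print(contradictions(inconsistent))
print(contradictions(consistent_but_unverified))
```

The check flags only the logical clash; it has no way to tell whether a lone statement is true.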
Thank you
To be continued ...