Technical "how to" training course on use of the Linked Data registry and the associated data conversion service.
Registry Environment registry service Technical training
Registry Course scope
Enable participants to:
  prepare a code list for registration
  register a code list
  manage a registered list
  access a code list
Out of scope:
  delegation and proxy support
  detailed update API
  general data modelling
  system deployment and management
Preamble
Registry Prerequisites
Required: familiarity with the material from the introductory webinar
  notion and purpose of the registry
  high-level information model
Preferable: some familiarity with JSON syntax
Helpful but not necessary: some knowledge of RDF and Turtle syntax
Preamble
Registry Course structure
1. Preamble
2. End to end example
3. Design and data preparation
4. Publication
5. Managing entries
6. Accessing content
7. Advanced cases (some optional)
8. Wrap up discussion
Workflow: Design > Data preparation > Publish > Manage > Access
Preamble
Registry End to end example
Demonstration
  csv code list - rbd.csv
  convert
  upload
  set status
  view
  download
Demo
Registry Design and data preparation
Case: local code list to be published openly
  no existing URIs (see advanced section for external entity case)
Topics to cover:
  URI structure
  standard code list cases
    data conversion utility
    simple code list
    hierarchical code list
    organizations
  representation and vocabularies
  custom code lists
    small extensions
    direct formatting
    see advanced topics later
Registry URI structure
Why?
  URIs are opaque, so technically the structure doesn't matter
  but predictable patterns help data users
  URI structure ties to administration structure
Formal guidance: http://tinyurl.com/UKGovLD-revisedUriPatterns
  http://{domain}{/collection*}
    [/id][/{concept}/{key}]*[/{concept}][#id]  - URI sets
    [/def]{/vocabulary*}[/{term}][#{term}]  - vocabularies
Design and data preparation
Registry URI structure
Registry convention
  treat as vocabularies (concepts)
  use top level sub-collections to ease management
    http://environment.data.gov.uk/registry/def/{collection}/{code-list}
    http://environment.data.gov.uk/registry/def/{collection}/{code-list}/{code}
Design and data preparation
Registry URI structure
{collection}
  choose a name to reflect the nature of the lists
  that can be kept stable
  that is legal in a URI segment
  registry admin/organization SRO creates, delegates to publisher
  example: catchment-planning
{code-list}
  choose a stable, legal name reflecting the nature of the list
  often a noun reflecting the type of the entity in the list
  example: RiverBasinDistrict
{code}
  unique identifier for the entry in the list
  typically available as an id or notation, otherwise mangle label
Design and data preparation
Registry URI structure
Registry upload allows relative URIs
  create list using {code-list}/{code}
  then upload to: http://environment.data.gov.uk/registry/def/{collection}
Design and data preparation
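The URI convention above can be sketched in a few lines of Python. This is only an illustration of the pattern on the slides, not part of the registry software: the function names are hypothetical, and the example code UK01 is borrowed from the access examples later in the course.

```python
from urllib.parse import quote

# Base URL of the registry, as given on the slides.
REGISTRY_BASE = "http://environment.data.gov.uk/registry"


def code_list_uri(collection, code_list, base=REGISTRY_BASE):
    """URI of a code list: {base}/def/{collection}/{code-list}."""
    return f"{base}/def/{quote(collection, safe='')}/{quote(code_list, safe='')}"


def code_uri(collection, code_list, code, base=REGISTRY_BASE):
    """URI of a single code: {base}/def/{collection}/{code-list}/{code}."""
    return f"{code_list_uri(collection, code_list, base)}/{quote(code, safe='')}"


def relative_code_uri(code_list, code):
    """Relative form {code-list}/{code}, for uploading under /def/{collection}."""
    return f"{quote(code_list, safe='')}/{quote(code, safe='')}"


print(code_uri("catchment-planning", "RiverBasinDistrict", "UK01"))
print(relative_code_uri("RiverBasinDistrict", "UK01"))
```

Percent-encoding each segment is a defensive choice made here; picking names that are already legal in a URI segment, as the slide recommends, makes it a no-op.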
Registry Standard cases
[Diagram: data flow. Labels: existing local code lists, [CSV], registry-util converter, RDF, Registry service, code list server, SPARQL, ETL, JSON, proxy requests]
Design and data preparation
Registry Standard cases
Supported by data conversion tool
  http://environment.data.gov.uk/registry-util/
  predefined templates for some common formats
  more templates can be added
  form for metadata
  preview raw converted data
  generates (RDF) file ready for publication
Design and data preparation
Registry Standard cases
Simple, flat code list
  templates: simple-skos, labelled-skos (if no notation column)
CSV structure:
  Column       Meaning                               Required?
  label        label used for presentation and UIs   Required
  description  explanation of code (text)            Optional
  notation     notation used in data                 Optional (can use label)
  note         supplementary note (text)             Optional
  definition   formal definition (text)              Optional
Design and data preparation
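A CSV file in the simple-skos layout can be produced with Python's standard csv module. The sketch below is illustrative only: the column names come from the slide, while the function name and the sample row are made up for the example.

```python
import csv
import io

# Column order as listed on the slide for the simple-skos template.
COLUMNS = ["label", "description", "notation", "note", "definition"]


def write_simple_skos_csv(rows, out):
    """Write rows (dicts keyed by column name) as a simple code list CSV."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    for row in rows:
        if not row.get("label"):
            # label is the only required column
            raise ValueError("every row needs a label")
        writer.writerow(row)


# Hypothetical sample data, not a real code list.
rows = [{"label": "Example A", "description": "First example code", "notation": "A"}]
buffer = io.StringIO()
write_simple_skos_csv(rows, buffer)
print(buffer.getvalue())
```

Optional columns that a row does not supply are written as empty cells.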
Registry Standard cases
Hierarchical code list
  template: hierarchical-skos
CSV structure: label1, label2, label3, description, notation, note, definition
  Label column, one per level:
    label1  label (top level code)
    label2  label (second level code)
    label3  label (third level code)
  Remaining columns, the same at every level:
    description  explanation of code (optional)
    notation     notation used in data (optional)
    note         (optional)
    definition   (optional)
Design and data preparation
Registry Standard cases
Simple organization
  template: two-level-organization
CSV structure:
  org                              suborg                      description
  label of a parent organization                               description of the organization
                                   label of sub-organization   description of the sub-organization
                                   label of sub-organization   description of the sub-organization
Design and data preparation
Registry Standard cases Will demonstrate shortly ... But first
look at the representation details helpful background understand
conversion previews only strictly necessary if developing custom
lists Design and data preparation
Registry Representation and vocabularies What do you want to
say about a code? How should it be represented in the data? Recall
information model: each registered item is identified by a URI
described by a set of property values each property is itself
identified by a URI standard vocabularies of useful properties
open, can freely add properties mandatory minimum is a type and
label Design and data preparation
Registry Aside on notation
Prefix notation
  avoid writing long URIs for types and properties
  prefix:local, for example: rdfs:label, skos:Concept
  prefix maps to a namespace URI
  related to qnames in XML but just concatenation
    skos:Concept = http://www.w3.org/2004/02/skos/core#Concept
Design and data preparation
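Because prefix notation is just concatenation, expanding a prefixed name takes only a dictionary lookup. The sketch below uses a small hand-written prefix table with the standard namespace URIs for rdf, rdfs, skos and dct; the registry's own prefix register may hold more, and the function name is hypothetical.

```python
# Standard namespace URIs for a few of the common prefixes.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dct": "http://purl.org/dc/terms/",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand prefix:local into a full URI by simple concatenation."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return prefixes[prefix] + local


print(expand("skos:Concept"))
print(expand("rdfs:label"))
```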
Registry Common prefixes
Registry preloaded with common prefixes [Just another register, so can extend]
  Prefix  Vocabulary                            Examples
  rdf     RDF core                              rdf:type
  rdfs    RDF schema                            rdfs:label, rdfs:comment
  skos    Simple Knowledge Organization Scheme  skos:Concept, skos:prefLabel, skos:broader
  dct     Dublin core terms                     dct:description, dct:publisher
  org     Organization ontology                 org:Organization
  reg     Registry vocabulary                   reg:Register
  ldp     Linked data platform                  ldp:Collection, ldp:membershipPredicate
  xsd     XML Schema Datatypes                  xsd:string
Design and data preparation
Registry Representation and vocabularies
Standard templates
SKOS examples: a single code row is given the properties:
  Property           Value
  rdf:type           skos:Concept
  skos:prefLabel     label
  rdfs:label         label
  skos:notation      notation or clean(label)
  dct:description    description
  skos:note          note
  skos:definition    definition
  skos:inScheme
  skos:topConceptOf  if top level
  skos:narrower
Design and data preparation
Registry Example Hierarchical code list look at vehicles.csv
create project and upload select template fill in metadata convert
browse Design and data preparation Demo
Registry Custom code lists Sometimes want a richer
representation custom types for entries additional properties
correspondence mappings whole different representation Various
options to achieve this: request additional templates for the
registry-util generic property columns custom data generation (see
later) Design and data preparation
Registry Generic property columns
Situation
  standard SKOS template is mostly fine
  but want to give entries additional types or additional properties
Solution
  add a column in the sheet for the additional property
  the column name gives the URI for the property in <>
  if a column value is in <> then it's treated as a URI
  else it's a literal (string, number, date)
Design and data preparation
Registry Generic property columns - example River Basin
Districts would like to also type each entry to match WFD
vocabulary Design and data preparation
Registry Publication Security model Upload forms
Registry Security model
Authentication
  not needed to read and browse
  username/password or OpenID (e.g. Yahoo)
  can set password for OpenID as a backup
  register using OpenID provider or email/password
  up to administrator to grant you permissions
Authorization
  rights granted to Register or Item
  rights on Register inherit to sub-registers/items
  manager role: register, update, status-update, grant
  maintainer role: update, grant
  can set a Register fully open - any registered user can update
Publication
Registry Publish a prepared registration Login Publication
Registry Publish a prepared registration Navigate to location
then Admin > Add registration Publication
Registry Publish a prepared registration Upload prepared file
Publication
Registry Publish a prepared registration Other registration
actions upload an individual entry upload a register with its
initial contents upload a register of external items create a
register forward the URL (advanced usage only) manually create an
entry (not recommended) Publication
Registry Managing entries Setting status Manual corrections
Uploading corrections API access, not covered here future support
for correcting by CSV round trip
Registry Setting status Registry follows generalized ISO19135
status model Managing entries
Registry Setting status
Visibility
  only the accepted codes/registers are visible to users
  administrators see entries in all statuses
  our current upload is submitted and so not visible
  might be some formal review process before it gets promoted
To set status
  navigate into new register
  Admin > Set status
  Admin > Set content status
  only offers legal state transitions
Managing entries
Registry Accessing data Viewing item or register in browser
Fetch register or item in machine readable form RDF (Turtle,
RDF/XML) JSON (JSON-LD) Fetching metadata Versioning and
history
Registry Real registry information model
Simplified registry information model
  a register is a list of entries, which we've called items
  easy for consumers to work with
  can publish data using standard templates without having to go beyond this model
Real information model
  separates metadata about the entry from the entry itself
  the RegisterItem has all the metadata
  this in turn points to some Entity, which may be external to the registry
Accessing data
Registry Real registry information model
[Diagram: a Register linked by "register" to two RegisterItems (label, description, status, submitter, item class, date submitted, etc.); each RegisterItem linked by "definition" to an EntityReference, which is linked by "entity" to the entity]
Accessing data
Registry Registry information model
Every RegisterItem has a notation
URI for the RegisterItem is:
  http://environment.data.gov.uk/registry/def/{collection}/{code-list}/_{item}
So for registers which list external resources
  the RegisterItem URI is in the registry namespace
  the entry itself can be in another namespace
Accessing data
Registry Registry information model
Actually more complex due to versioning
  each RegisterItem can have a history of changes
  can retrieve specific earlier version using
    http://environment.data.gov.uk/registry/def/{collection}/{code-list}/_{item}:version
Accessing data
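The two RegisterItem URI forms (current and versioned) differ from the entry URI only by the leading underscore and the optional :version suffix. A small sketch, assuming the patterns quoted on the slides; the function name is hypothetical and UK01 is used purely as an example item.

```python
REGISTRY_BASE = "http://environment.data.gov.uk/registry"


def register_item_uri(collection, code_list, item, version=None):
    """URI of the RegisterItem (metadata) for an entry.

    The item notation is prefixed with "_"; a specific version is
    addressed by appending ":" and the version number.
    """
    uri = f"{REGISTRY_BASE}/def/{collection}/{code_list}/_{item}"
    return uri if version is None else f"{uri}:{version}"


print(register_item_uri("catchment-planning", "RiverBasinDistrict", "UK01"))
print(register_item_uri("catchment-planning", "RiverBasinDistrict", "UK01", version=2))
```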
Registry Registry information model Hide the complexity
versioning and metadata are necessary features but for common uses
just want to see the list of current entries in their current state
so the default view (via browser, or simple GET) constructs
simplified view Accessing data
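Registry Registry information model
Real structure
[Diagram: a reg:Register and two reg:RegisterItem nodes; each reg:RegisterItem is linked to the register by reg:register and to a reg:EntityReference by reg:definition; each reg:EntityReference is linked to its entity by reg:entity]
Accessing data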
Registry Registry information model
Simplified container view
  property which links register to entity can be configured, can point either way
  induced membership relation, default is rdfs:member
[Diagram: the container view (reg:Register linked directly to each entity by the induced membership relation) shown alongside the full view (reg:Register, reg:RegisterItem, reg:EntityReference and entity, linked by reg:register, reg:definition and reg:entity)]
Accessing data
Registry Registry information model
Hide the complexity
  versioning and metadata are necessary features
  but for common uses just want to see the list of current entries in their current state
  so the default view (via browser, or simple GET) constructs a simplified view
  but if you ask for with_metadata you see the raw structure
Accessing data
Registry Accessing data
Accessing individual entries
  URI=http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict
  just the entry
    curl -i -H "Accept: application/ld+json" "$URI/UK01"
    curl -i -H "Accept: text/turtle" "$URI/UK01"
  entry plus metadata
    curl -i -H "Accept: text/turtle" "$URI/UK01?_view=with_metadata"
  the RegisterItem (metadata) - useful for external entries
    curl -i -H "Accept: text/turtle" "$URI/_UK01"
Accessing data
Registry Accessing data
Fetch register or item in machine readable form
  register plus all content
    curl -i -H "Accept: text/turtle" "$URI"
  just register
    curl -i -H "Accept: text/turtle" "$URI?non-member-properties"
  register plus all entries with status submitted
    curl -i -H "Accept: text/turtle" "$URI?status=submitted"
  register, entries and all metadata
    curl -i -H "Accept: text/turtle" "$URI?_view=with_metadata"
  page through long list of entries
    curl -i -H "Accept: text/turtle" "$URI?first-page"
    curl -i -H "Accept: text/turtle" "$URI?_page=2"
Accessing data
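The same requests can be prepared from Python with the standard library. The sketch below only builds the request objects (it does not contact the registry), reusing the register URI and query strings from the curl examples; the helper name registry_request is hypothetical.

```python
from urllib.request import Request

# Register URI taken from the curl examples.
URI = ("http://environment.data.gov.uk/registry/def/"
       "catchment-planning/RiverBasinDistrict")


def registry_request(uri, query="", media_type="text/turtle"):
    """Build (but do not send) a GET request with the given Accept header."""
    url = f"{uri}?{query}" if query else uri
    return Request(url, headers={"Accept": media_type})


# A few of the cases shown with curl.
examples = {
    "entry as JSON-LD": registry_request(f"{URI}/UK01", media_type="application/ld+json"),
    "entry plus metadata": registry_request(f"{URI}/UK01", "_view=with_metadata"),
    "register item": registry_request(f"{URI}/_UK01"),
    "submitted entries": registry_request(URI, "status=submitted"),
    "second page": registry_request(URI, "_page=2"),
}

for name, req in examples.items():
    print(name, req.full_url, req.get_header("Accept"))

# To fetch for real: urllib.request.urlopen(req).read()
```

Content negotiation through the Accept header is what selects Turtle or JSON-LD; the URL stays the same.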
Registry Versioning History and versioning each thing in the
registry is versioned entity (code) item (metadata) registers
version management is done behind the scenes normal requests show
the current version the URI for the current version does not change
if the code changes meaning - create a new code Accessing data
Registry Versioning
[Diagram: version history of one RegisterItem]
  nodes:
    //registry/reg/_foo    RegisterItem, VersionedThing
    //registry/reg/_foo:1  RegisterItem, Version
    //registry/reg/_foo:2  RegisterItem, Version
    //registry/reg/foo     (entity) rdfs:label "wrong"
    //registry/reg/foo     (entity) rdfs:label "fixed"
  links: dct:versionOf, version:currentVersion, dct:replaces/dct:replacedBy, reg:definition, version:interval
  intervals:
    hasBeginning: 5 Mar 2014 17:24:25.362, hasEnd: 25 Jun 2014 14:14:59.080
    hasBeginning: 25 Jun 2014 14:14:59.080
Accessing data
Registry Accessing data versions
Accessing individual entries
  URI=http://environment.data.gov.uk/registry/def/ea-organization/ea_areas
  current version
    curl -i -H "Accept: text/turtle" "$URI/_1-1"
  specific version
    curl -i -H "Accept: text/turtle" "$URI/_1-1:2"
  list versions
    curl -i -H "Accept: text/turtle" "$URI/_1-1:3?_view=version_list"
Accessing data
Registry Picture CC-BY-2.0 Annie Roi @flickr.com
Registry Advanced
Collections of external URIs
Custom code lists in JSON - optional
  single entry
  whole register
  batch update
Not covered:
  patching
  forwarding and delegation
  restriction of register content types
Registry Collection of external entities
Use case
  create a register whose entries are references to external codes or entities that already have URIs
Why
  endorse or qualify uses of the codes for your purpose
  select a subset of some larger code list
How
  create a collection whose entries are external URIs and use Upload ref-batch
  create and register entries with explicit RegisterItem metadata
External entities
Registry Worked example
INSPIRE spatial data themes
  http://www.eionet.europa.eu/gemet/inspire_themes?langcode=en
Process
  locate machine processable descriptions of the themes
    http://inspire.ec.europa.eu/theme/
    http://inspire.ec.europa.eu/theme/theme.en.json
  transform into a suitable format
  pick out the subset to register
  register it
External entities
Registry Transform data to suitable format
Two options
  via csv
    flatten json to pick out information we want
    convert flat json to csv
    [edit csv]
    convert csv to RDF using data converter tool
  via jsonld
    convert json to jsonld
    [edit jsonld]
External entities
Registry Transform data to suitable format
Convert to CSV
  flatten json, e.g. use jq
    [ .register.containeditems[].theme |
      { description: .description.text,
        definition: .definition.text,
        label: .label.text,
        id: .id,
        notation: .id|ltrimstr("http://inspire.ec.europa.eu/theme/") } ]
  convert flat json to CSV
    various online tools
  prepare for registration using dcutil
External entities
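As an alternative to jq plus an online converter, the flatten and CSV steps can be done in one short Python script. This is a sketch that mirrors the jq filter above and assumes the same JSON shape (register.containeditems[].theme with label, description and definition objects carrying a text field). The sample document is made up for illustration, and the function names are hypothetical.

```python
import csv
import io

THEME_PREFIX = "http://inspire.ec.europa.eu/theme/"
FIELDS = ["label", "description", "definition", "id", "notation"]


def flatten_themes(doc):
    """Pick out one flat row per theme, as the jq filter does."""
    rows = []
    for contained in doc["register"]["containeditems"]:
        theme = contained["theme"]
        theme_id = theme["id"]
        # Like jq's ltrimstr: strip the prefix only when it is present.
        if theme_id.startswith(THEME_PREFIX):
            notation = theme_id[len(THEME_PREFIX):]
        else:
            notation = theme_id
        rows.append({
            "label": theme["label"]["text"],
            "description": theme["description"]["text"],
            "definition": theme["definition"]["text"],
            "id": theme_id,
            "notation": notation,
        })
    return rows


def rows_to_csv(rows):
    """Serialise the flat rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up sample with the assumed shape, not real INSPIRE data.
sample = {"register": {"containeditems": [{"theme": {
    "id": THEME_PREFIX + "xx",
    "label": {"text": "Example theme"},
    "description": {"text": "Example description"},
    "definition": {"text": "Example definition"},
}}]}}

print(rows_to_csv(flatten_themes(sample)))
```

Unlike jq, which yields null for a missing field, this version raises KeyError, which makes gaps in the source data visible early.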
Registry Example
INSPIRE themes example
  browse to json
  use jq to pick out sections
  flatten with jq
  look at csv
  create dcutil project
  select external-skos template
  convert
  browse
External entities Demo
Registry Advanced
Collections of external URIs
Custom code lists in JSON - optional
  single entry
  whole register
  batch update
Registry Custom code lists in JSON Custom code lists Why need
representation not covered by dcutil templates more complex data
than SKOS custom annotation and types How general property columns
(see earlier) prepare data using RDF tool chains (not covered)
prepare data as JSON compliant with JSON-LD Custom JSON
Registry JSON-LD format W3C specification supports mapping of
JSON data to RDF Supported by the registry provides JSON-LD
@context defining prefixes can directly import or export in JSON-LD
Custom JSON
Registry Custom code list cases single entry whole register
batch update Custom JSON
Registry Formatting a single entry
The entry should be a json object
    {
      "@context" : "http://environment.data.gov.uk/registry/system/json-context",
      "@id" : "entry",
      ...
    }
  "@context" imports all registry prefixes
  "@id" gives a URI for the entry, typically a relative URI, which will be relative to the register in which we place the entry
Custom JSON Single entry
Registry Formatting a single entry
Now add some descriptive properties
    {
      "@context" : "http://environment.data.gov.uk/registry/system/json-context",
      "@id" : "entry",
      "@type" : "skos:Concept",
      "rdfs:label" : "Entry",
      "dct:description" : "I am an entry but described using JSON-LD"
    }
  "@type" is the type of the entry (translates to rdf:type)
Custom JSON Single entry
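If entries are generated rather than written by hand, building the object as a Python dict and serialising it with json.dumps avoids quoting mistakes. A minimal sketch using the property names from the slide; the helper name make_entry is hypothetical.

```python
import json

# JSON-LD context published by the registry, as quoted on the slide.
CONTEXT = "http://environment.data.gov.uk/registry/system/json-context"


def make_entry(entry_id, label, description, rdf_type="skos:Concept"):
    """Build a single-entry JSON-LD object with a relative @id."""
    return {
        "@context": CONTEXT,
        "@id": entry_id,
        "@type": rdf_type,
        "rdfs:label": label,
        "dct:description": description,
    }


entry = make_entry("entry", "Entry", "I am an entry but described using JSON-LD")
text = json.dumps(entry, indent=2)
print(text)  # save this text as entry.jsonld before uploading
```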
Registry Formatting a single entry Demonstration register this
as an entry in an existing register entry.jsonld note suffix
matters Custom JSON Single entry Demo
Registry Formatting a whole register
Bulk registration options
  create register plus contents in one go
  only legal for known types (bulkCollections): Concept Scheme, Collection, Register
  need some property to register to/from members
    Type                Membership property
    skos:ConceptScheme  ^ skos:inScheme
    skos:Collection     skos:member
    reg:Register        rdfs:member
Custom JSON Whole register
Registry Formatting a whole register
Raw register
  start with json object for the Register resource
    {
      "@context" : "http://environment.data.gov.uk/registry/system/json-context",
      "@id" : "register",
      "@type" : [ "reg:Register" ],
      "dct:description" : "A demonstration register",
      "rdfs:label" : "Register",
      ...
    }
  we will want more metadata, see later
Custom JSON Whole register
Registry Formatting a whole register
Raw register
  declare members as further objects
    {
      "@context" : "http://environment.data.gov.uk/registry/system/json-context",
      "@id" : "register",
      "@type" : [ "reg:Register" ],
      "dct:description" : "A demonstration register",
      "rdfs:label" : "Register",
      "rdfs:member" : [
        { "@id" : "register/member1",
          "@type" : "skos:Concept",
          "dct:description" : "I am the first member",
          "rdfs:label" : "Member 1" },
        { ... }
      ]
    }
  declare members inline in an array (other options)
  member URIs have to be children of the root URI
Custom JSON Whole register
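A register with inline members can be generated from a simple list of member records. The sketch below follows the structure on the slide and derives each member @id from the register @id, so members are always children of the root URI. The helper name and the tuple layout of the input are assumptions made for the example.

```python
import json

CONTEXT = "http://environment.data.gov.uk/registry/system/json-context"


def make_register(register_id, label, description, members):
    """Build a reg:Register with members nested under rdfs:member.

    members is an iterable of (local_id, label, description) tuples.
    """
    member_objects = [
        {
            # Child of the register URI, as the registry requires.
            "@id": f"{register_id}/{local_id}",
            "@type": "skos:Concept",
            "dct:description": member_description,
            "rdfs:label": member_label,
        }
        for local_id, member_label, member_description in members
    ]
    return {
        "@context": CONTEXT,
        "@id": register_id,
        "@type": ["reg:Register"],
        "dct:description": description,
        "rdfs:label": label,
        "rdfs:member": member_objects,
    }


register = make_register(
    "register", "Register", "A demonstration register",
    [("member1", "Member 1", "I am the first member"),
     ("member2", "Member 2", "I am the second member")],
)
print(json.dumps(register, indent=2))
```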
Registry Formatting a single entry Demonstration register this
whole register register.jsonld Custom JSON Whole register Demo
Registry Formatting a whole register
skos:Collection case
    {
      "@context" : "http://environment.data.gov.uk/registry/system/json-context",
      "@id" : "collection",
      "@type" : [ "skos:Collection" ],
      "dct:description" : "A demonstration collection of concepts",
      "rdfs:label" : "Collection",
      "skos:member" : [
        { "@id" : "collection/member1",
          "@type" : "skos:Concept",
          "dct:description" : "I am the first member",
          "rdfs:label" : "Member 1" },
        { ... }
      ]
    }
  Registry knows that skos:Collections use a different membership property
Custom JSON Whole register
Registry Formatting a whole register
skos:ConceptScheme case
  complicated because entries point to the container instead of the other way round
[Diagram: a skos:Collection "collection" points to collection/member1 and collection/member2 via skos:member; scheme/member1 and scheme/member2 point to a skos:ConceptScheme "scheme" via skos:inScheme]
Custom JSON Whole register
Registry Formatting a whole register
skos:ConceptScheme case
  complicated because entries point to the container
  so we can't nest the json as a tree
  have to provide a set of resources, with links between them
  use @graph to give an array of objects
  to link from one resource to another use @id
Custom JSON Whole register
Registry Formatting a whole register
skos:ConceptScheme case
    {
      "@context" : "http://environment.data.gov.uk/registry/system/json-context",
      "@graph" : [
        { "@id" : "scheme",
          "@type" : [ "skos:ConceptScheme" ],
          "dct:description" : "A demonstration concept scheme",
          "rdfs:label" : "Scheme" },
        { "@id" : "scheme/member1",
          "@type" : "skos:Concept",
          "dct:description" : "I am the first member",
          "rdfs:label" : "Member 1",
          "skos:inScheme" : { "@id" : "scheme" } },
        { ... }
      ]
    }
Custom JSON Whole register
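For the concept scheme case the members cannot be nested, so a generator has to emit a flat @graph in which every concept links back to the scheme. A sketch of that shape, assuming the same input layout of (local_id, label, description) tuples; the helper name is hypothetical.

```python
import json

CONTEXT = "http://environment.data.gov.uk/registry/system/json-context"


def make_concept_scheme(scheme_id, label, description, members):
    """Build a flat @graph: the scheme first, then one object per concept."""
    graph = [{
        "@id": scheme_id,
        "@type": ["skos:ConceptScheme"],
        "dct:description": description,
        "rdfs:label": label,
    }]
    for local_id, member_label, member_description in members:
        graph.append({
            "@id": f"{scheme_id}/{local_id}",
            "@type": "skos:Concept",
            "dct:description": member_description,
            "rdfs:label": member_label,
            # The concept points at its container, not the other way round.
            "skos:inScheme": {"@id": scheme_id},
        })
    return {"@context": CONTEXT, "@graph": graph}


scheme = make_concept_scheme(
    "scheme", "Scheme", "A demonstration concept scheme",
    [("member1", "Member 1", "I am the first member")],
)
print(json.dumps(scheme, indent=2))
```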
Registry Formatting a whole register Metadata every register
should have metadata at least publisher license preferably rights
statement classification to help navigation which properties to
use? where to get the controlled (URI!) values? Custom JSON Whole
register
Registry Metadata
  Metadata                      Property           Source of values
  Publisher                     dct:publisher      http://environment.data.gov.uk/registry/structure/org
  License                       dct:license        http://www.nationalarchives.gov.uk/doc/open-government-licence/version/2/
  Rights                        dct:rights         Nested resource specifying an odrs:attributionText
  Classification - entity type  env-ui:entityType  http://environment.data.gov.uk/registry/structure/entity-type
  Classification - category     reg:category       http://environment.data.gov.uk/registry/structure/category
Custom JSON Whole register
Registry Metadata
Register example with metadata
    {
      "@context" : "http://environment.data.gov.uk/registry/system/json-context",
      "@id" : "register2",
      "@type" : [ "reg:Register" ],
      "rdfs:label" : "Register2",
      "dct:description" : "A demonstration register",
      "dct:publisher" : {"@id" : "http://environment.data.gov.uk/registry/structure/org/department-for-environment-food-rural-affair"},
      "dct:rights" : {
        "odrs:attributionText" : "Contains public sector information licensed under the Open Government Licence v2.0." },
      "dct:license" : {"@id" : "http://www.nationalarchives.gov.uk/doc/open-government-licence/version/2/"},
      "reg:category" : {"@id" : "http://environment.data.gov.uk/registry/structure/category/System"},
      "env-ui:entityType" : {"@id" : "http://environment.data.gov.uk/registry/structure/entity-type/Abstract"},
      ...
    }
Custom JSON Whole register
Registry Batch update So far have seen: registering a single
entry registering an entire collection with contents Final case:
register a set of entries in an existing collection requires entry
+ item metadata allows for registration of external entries Custom
JSON Batch update
Registry Batch update (external case)
Each entry needs item plus entry itself
    {
      "@context" : "http://environment.data.gov.uk/registry/system/json-context",
      "@graph" : [
        { "@id" : "_litre",
          "@type" : "reg:RegisterItem",
          "reg:status" : { "@id" : "reg:statusStable" },
          "reg:definition" : {
            "reg:entity" : { "@id" : "http://qudt.org/vocab/unit#Liter" } } },
        { "@id" : "http://qudt.org/vocab/unit#Liter",
          "@type" : "http://qudt.org/schema/qudt#VolumeUnit",
          "rdfs:label" : "Litre",
          "dct:description" : "Non-SI unit of volume equal to 1 dm^3" },
        ...
      ]
    }
  first object: create an item in the register to reference the entity
  reg:definition / reg:entity: link the two together
  second object: the external entity, with core descriptive properties
Custom JSON Batch update
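For a batch of external entities, each one contributes two objects to the @graph: the RegisterItem and the entity it references. The sketch below generates that pair with the property names used on the slide; the helper names are hypothetical and the litre example is taken from the slide.

```python
import json

CONTEXT = "http://environment.data.gov.uk/registry/system/json-context"


def external_entry(notation, entity_uri, entity_type, label, description,
                   status="reg:statusStable"):
    """Return the [RegisterItem, entity] pair for one external entity."""
    item = {
        "@id": f"_{notation}",          # item URIs start with an underscore
        "@type": "reg:RegisterItem",
        "reg:status": {"@id": status},
        "reg:definition": {"reg:entity": {"@id": entity_uri}},
    }
    entity = {
        "@id": entity_uri,
        "@type": entity_type,
        "rdfs:label": label,
        "dct:description": description,
    }
    return [item, entity]


def batch(entries):
    """Wrap a list of [item, entity] pairs in a single @graph document."""
    graph = [obj for pair in entries for obj in pair]
    return {"@context": CONTEXT, "@graph": graph}


doc = batch([external_entry(
    "litre",
    "http://qudt.org/vocab/unit#Liter",
    "http://qudt.org/schema/qudt#VolumeUnit",
    "Litre",
    "Non-SI unit of volume equal to 1 dm^3",
)])
print(json.dumps(doc, indent=2))
```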
Registry Course scope Enable participants to: prepare a code
list for registration register a code list manage a registered list
access a code list Out of scope delegation and proxy support
detailed update API general data modelling system deployment and
management Preamble
Registry Links
  Design and API details: https://github.com/UKGovLD/ukl-registry-poc/wiki
  Alpha site: http://environment.data.gov.uk/registry
  Data conversion service: http://environment.data.gov.uk/registry-util/
  Training registry for experimentation: http://registry-training.epimorphics.net/registry/
Registry Picture CC-BY-2.0 Annie Roi @flickr.com Wrap up
discussion
Registry Spares
Registry Updating an entry
Options:
  manual edit through web form
  PATCH over web API
  no UI for patch upload yet, possible future extension
Patch
  provide new values for properties to be changed
  omit ones that aren't changing
  use PATCH rather than PUT/POST
  needs API key access, out of scope for this course
Advanced Patch
Registry Convenient views
  full RegisterItem/Register structure is complex
  versioning makes that a lot worse
[Diagram: full versioned structure]
  nodes:
    //registry             Register, VersionedThing
    //registry:1, :2       Register, Version
    //registry/_reg        RegisterItem, VersionedThing
    //registry/_reg:1, :2  RegisterItem, Version
    //registry/reg         Register, VersionedThing
    //registry/reg:1, :2   Register, Version
    //registry/reg/_foo    RegisterItem, VersionedThing
    //registry/reg/_foo:1, :2  RegisterItem, Version
    //registry/reg/foo     (entity)
  links: dct:versionOf, reg:register, reg:definition
Registry Conceptual architecture router renderer request
processor user credentials roles and bindings auth registry core
logic Registry RDF store text index style and templates external UI
admin UI log audit trail storeAPI nginx proxy conf API
Registry Structure information model managed entity URL in
registry namespace registry holds master copy of the entity data
Register http://.../def/catchment-planning/RiverBasinDistrict/
Register Item
http://.../def/catchment-planning/RiverBasinDistrict/_UK05 Entity
http://.../def/catchment-planning/RiverBasinDistrict/UK05
Registry Structure information model referenced entity URL
external to registry (well, register) registry holds minimal copy
of data Register
http://.../def/catchment-planning/RiverBasinDistrict/ Register Item
http://.../def/catchment-planning/RiverBasinDistrict/_UK05 Entity
http://agency.gov.uk/RDB/Anglia
Registry Federation, delegation and namespaces
Case 1: External entities
  identifier published in a different namespace
  want to include it in an authoritative list
Solution: just register as a referenced entity
  already seen this
  authoritative because it's on the list
  can record properties of the entity, and maintain history
  no namespace management involved
Registry Referenced entities /local /id /local-authority
Registry External service e.g. opencommunities.org Hosted by LA
directly
Registry Case 2: Namespace allocation want someone else to
serve part of the registry namespace might be a single item or a
complete register sub tree e.g. allocating namespace in
location.data.gov.uk for serving INSPIRE spatial object identifiers
Solution: reg:NamespaceForward can be a redirect (30X) or proxy
(200) no constraints on whether target acts like a Registry target
ought to serve linked data with URIs in the right namespace, but
not required Federation, delegation and namespaces
Registry Namespace forward /local /id /local-authority Registry
External web site could be anything
Registry Federation, delegation and namespaces Case 3:
Federated register want someone else to run part of the registry
infrastructure but act like one big registry integrated search,
validation etc Solution: reg:FederatedRegister can be a redirect
(30X) or proxy (200) target endpoint must comply with Registry API
at least for search, validation and entity lookup
Registry Federation, delegation and namespaces
Case 4: Delegating a register
  want someone else to serve the list of contents of the register
  but they only have a triple store, not a full registry implementation
Solution: reg:DelegatedRegister
  specify SPARQL endpoint and triple pattern to enumerate members
  reg:DelegatedRegister
    reg:delegationTarget [1]
    reg:enumerationSubject [0..1]
    reg:enumerationPredicate [0..1]
    reg:enumerationObject [0..1]
Registry Delegated register /local /id /local-authority
Registry External SPARQL service
Registry Security model authentication OpenID (e.g. Google,
Google profile) authorization permissions Register, Update,
StatusUpdate, Force, Grant, GrantAdmin inherit down the tree e.g.:
Register,Update:/example/local can grant to known user or anyone
authenticated bundled into roles Maintainer Update, Grant Manager
Register, StatusUpdate, Update, Grant Authorized Register, Update,
StatusUpdate - for experimental areas Administrator - anything
Registry Practical session
End to end example using vehicles.csv
data conversion
  register with http://environment.data.gov.uk/registry-util/
  create project, use shortname based on your initials
  upload vehicles.csv
  choose template
  fill in metadata form
  run conversion and browse results
data publication
  register with TDB test registry (yes, these are separate accounts)
  browse to def/webinar2
  register results of data conversion
  set status
Practical your turn