
Page 1: Produce and consume_linked_data_with_drupal

www.sti-innsbruck.at © Copyright 2008 STI INNSBRUCK

Produce and Consume Linked Data with Drupal

Stephane Corlosquet, Renaud Delbru, Tim Clark, Axel Polleres and Stefan Decker

Ioan Toma

Page 2: Produce and consume_linked_data_with_drupal

Acknowledgements

• Many of the slides in this presentation are based on: http://www.slideshare.net/scorlosquet/produce-and-consume-linked-data-with-drupal?src=related_normal&rel=4796732

Page 3: Produce and consume_linked_data_with_drupal

• There is a lot of data on the Web inside Content Management Systems (CMSs)

• Moreover, this data is structured

• However:

  • It is not possible to reuse this data outside the CMS (except via RSS, and RSS is limited when it comes to semantics)

  • This data is not available in a unified machine-readable format

Motivation

Page 4: Produce and consume_linked_data_with_drupal

• Goal: integrate “any” CMS site into the Web

• Why implement it in Drupal?
  • One of the most popular CMSs
  • Lots of extra functionality available as modules

• Approach in short: Develop a set of modules that perform:

1. Automatic site vocabulary generation

2. Mapping content models (site vocabulary) to existing vocabularies

3. Data endpoint for SPARQL querying

4. Lazy loading of external data (data import)

Approach

Page 5: Produce and consume_linked_data_with_drupal

Approach

Page 6: Produce and consume_linked_data_with_drupal

• Ontology-based CMSs:
  • Semantic community Web portals (2000)
  • Model-driven, ontology-based Web site management

The approach in the paper instead starts from existing CMS infrastructure

• Mapping the RDBMS underlying a CMS to RDF/RDFS

The approach in the paper starts from the site model and its constraints, not from the underlying database model

• SCF Node Proxy architecture: an RDF-to-Drupal mapping, not general but specific to the bio domain

The approach in the paper takes the SCF Node Proxy architecture as its starting point

Related work

Page 7: Produce and consume_linked_data_with_drupal

• Drupal:
  • Easy to use
  • Large community
  • Popular on the Web
  • Modular design

• Drupal terminology:
  • Node – corresponds to a Drupal Web page
  • Module – functionality that alters and extends Drupal core functionality
  • Site administrators – set up the site and install the modules they like/need
  • Module developers – develop module(s)
  • Site editors – create the content of the site following the schema defined by the site administrator

Drupal

Page 8: Produce and consume_linked_data_with_drupal

• GUI for extending the internal schema of a Drupal site
• Used on many Drupal sites
• Can build new types of pages, known as content types
• Can create fields for each content type. Fields can be of various types: plain text fields, dates, email addresses, file uploads, references to other pages

Drupal: Content Construction Kit
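As an illustration of what CCK lets a site administrator define, a content type with typed fields can be pictured as a plain data structure. The dict layout and the `conference` example are assumptions for illustration only; Drupal defines content types through its admin UI, not like this.

```python
# Illustrative sketch of a CCK-style content type: a named type made of
# typed fields, some of which are required. Field "kind" values mirror the
# field types named in the slide (text, date, email, file, node reference).

conference = {
    "type": "conference",
    "label": "Conference",
    "fields": [
        {"name": "title",   "kind": "text",    "required": True},
        {"name": "start",   "kind": "date",    "required": True},
        {"name": "contact", "kind": "email",   "required": False},
        {"name": "paper",   "kind": "file",    "required": False},
        {"name": "venue",   "kind": "noderef", "required": False},  # reference to another page
    ],
}

def required_fields(ctype):
    """Fields a site editor must fill in before a node of this type is valid."""
    return [f["name"] for f in ctype["fields"] if f["required"]]
```

A site editor creating a "conference" node would then have to supply at least the two required fields.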

Page 9: Produce and consume_linked_data_with_drupal

Drupal: Content Construction Kit – User Interface

Page 10: Produce and consume_linked_data_with_drupal

Drupal: Content Construction Kit – User Interface

Page 11: Produce and consume_linked_data_with_drupal

Drupal: Content Construction Kit – User Interface

Page 12: Produce and consume_linked_data_with_drupal

Approach

(figure: architecture diagram annotated with steps 1-4 of the approach)

Page 13: Produce and consume_linked_data_with_drupal

• Automatic site vocabulary in RDFS/OWL:
  • Content types and fields are mapped to classes (rdfs:Class) and properties (rdf:Property)
  • Labels and descriptions of content types and fields are mapped to rdfs:label and rdfs:comment
  • Cardinality is mapped to cardinality restrictions in OWL:
    • Required – owl:cardinality 1
    • Maximum cardinality – owl:maxCardinality n
  • Domains and ranges of fields are mapped to rdfs:domain and rdfs:range

1. Site Vocabulary
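The mapping rules above can be sketched in a few lines of Python. The dict layout, the `content_type_to_rdf` helper and the site namespace are illustrative assumptions, not the module's real API; OWL expresses cardinality through blank-node restriction classes, which the sketch flattens into separate tuples.

```python
# Illustrative sketch: turn a content-type definition into RDFS statements
# plus OWL cardinality restrictions, following the mapping rules on the slide.

SITE = "http://example.org/site/ns#"  # hypothetical site vocabulary namespace

def content_type_to_rdf(ctype):
    """Map a content type and its fields to (triples, restrictions)."""
    cls = SITE + ctype["name"]
    triples = [
        (cls, "rdf:type", "rdfs:Class"),
        (cls, "rdfs:label", ctype["label"]),
        (cls, "rdfs:comment", ctype["description"]),
    ]
    restrictions = []
    for f in ctype["fields"]:
        prop = SITE + f["name"]
        triples += [
            (prop, "rdf:type", "rdf:Property"),
            (prop, "rdfs:label", f["label"]),
            (prop, "rdfs:domain", cls),
            (prop, "rdfs:range", f["range"]),
        ]
        if f.get("required"):            # required field -> owl:cardinality 1
            restrictions.append((cls, prop, "owl:cardinality", 1))
        elif "max" in f:                 # bounded field -> owl:maxCardinality n
            restrictions.append((cls, prop, "owl:maxCardinality", f["max"]))
    return triples, restrictions

blog = {
    "name": "blog_post", "label": "Blog post",
    "description": "An entry on the site's blog",
    "fields": [
        {"name": "title", "label": "Title", "range": "xsd:string", "required": True},
        {"name": "tag", "label": "Tag", "range": "xsd:string", "max": 5},
    ],
}
```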

Page 14: Produce and consume_linked_data_with_drupal

• Import of any vocabulary published online
  • One only needs to specify the URL of the vocabulary
  • By default, FOAF, Dublin Core and SIOC are imported

• External ontology search service
  • Entity-centric search – returns the relevant classes and properties
  • Based on SWSE and Sindice

• Local terms are subclasses/subproperties of public terms
  • This ensures safe vocabulary reuse and avoids redefinition

2. Mapping Content Models to existing ontologies
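The "local terms are subclasses/subproperties of public terms" rule can be sketched as follows. The `map_term` helper and the example term names are hypothetical; only the FOAF and SIOC namespace URIs are real.

```python
# Illustrative sketch of safe vocabulary reuse: local terms are linked to
# public ones via rdfs:subClassOf / rdfs:subPropertyOf rather than being
# redefined, so public vocabularies keep their published meaning.

SITE = "http://example.org/site/ns#"      # hypothetical local namespace
FOAF = "http://xmlns.com/foaf/0.1/"
SIOC = "http://rdfs.org/sioc/ns#"

def map_term(local, public, kind):
    """Relate a local class/property to a public one without redefining it."""
    pred = "rdfs:subClassOf" if kind == "class" else "rdfs:subPropertyOf"
    return (SITE + local, pred, public)

mappings = [
    map_term("user", FOAF + "Person", "class"),        # site users are foaf:Persons
    map_term("blog_post", SIOC + "Post", "class"),     # blog posts are sioc:Posts
    map_term("author", SIOC + "has_creator", "property"),
]
```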

Page 15: Produce and consume_linked_data_with_drupal

2. Mapping Content Models to existing ontologies – RDF mapping page

Page 16: Produce and consume_linked_data_with_drupal

2. Mapping Content Models to existing ontologies – RDF mapping page

Page 17: Produce and consume_linked_data_with_drupal

• Local RDF data is exposed in a SPARQL endpoint
• Enables interoperability across sites
• Built on the PHP ARC2 library
• All RDF data is indexed in the endpoint
• Each page is stored as a graph and kept up to date

3. Data endpoint for complex queries
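The graph-per-page bookkeeping can be sketched with a toy store. A real endpoint (ARC2 in the paper) answers full SPARQL; the `match` method below only does a naive triple-pattern scan, and all names are illustrative.

```python
# Minimal sketch of "each page stored as a graph and kept up to date": the
# store keys triples by page URL, so re-indexing a page replaces exactly
# its own graph while queries run over the union of all graphs.

class SiteStore:
    def __init__(self):
        self.graphs = {}  # page URL -> set of (subject, predicate, object)

    def update_page(self, page_url, triples):
        """Re-indexing a page replaces its whole graph, keeping data fresh."""
        self.graphs[page_url] = set(triples)

    def match(self, s=None, p=None, o=None):
        """Toy triple-pattern query over the union of all page graphs."""
        return [t for g in self.graphs.values() for t in g
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

store = SiteStore()
store.update_page("http://example.org/node/1",
                  [("node1", "rdf:type", "sioc:Post"),
                   ("node1", "dc:title", "Hello")])
# Editing the page re-indexes it: the old graph is replaced entirely.
store.update_page("http://example.org/node/1",
                  [("node1", "dc:title", "Hello, world")])
```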

Page 18: Produce and consume_linked_data_with_drupal

• Lazy loading (caching) of distant RDF resources
• Enables interoperability across sites
• Built on the PHP ARC2 library
• A CONSTRUCT query maps the distant schema to the local schema

4. Lazy loading of external data
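The lazy-loading idea can be sketched as a fetch-on-first-access cache. The paper performs the schema rewriting with a SPARQL CONSTRUCT query; a plain predicate-to-predicate map stands in for it here, and `fetch` is a hypothetical callback, not a real API.

```python
# Sketch of lazy loading: a distant RDF resource is fetched only on first
# access, cached, and rewritten from the remote schema to the local one.

class LazyRdfCache:
    def __init__(self, fetch, predicate_map):
        self.fetch = fetch                  # callable: URI -> list of remote triples
        self.predicate_map = predicate_map  # remote predicate -> local predicate
        self.cache = {}

    def get(self, uri):
        if uri not in self.cache:           # load the distant resource only once
            remote = self.fetch(uri)
            # Stand-in for the CONSTRUCT-based rewrite: rename mapped predicates.
            self.cache[uri] = [
                (s, self.predicate_map.get(p, p), o) for s, p, o in remote
            ]
        return self.cache[uri]

calls = []
def fake_fetch(uri):                        # hypothetical network fetch, instrumented
    calls.append(uri)
    return [(uri, "foaf:name", "Alice")]

cache = LazyRdfCache(fake_fetch, {"foaf:name": "site:title"})
```

Repeated `get` calls on the same URI hit the cache, so the remote site is contacted only once per resource.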

Page 19: Produce and consume_linked_data_with_drupal

• Practical work adding RDF support to Drupal through a set of Drupal modules that provide:

1. Automatic site vocabulary generation

2. Mapping content models (site vocabulary) to existing vocabularies

3. Data endpoint for SPARQL querying

4. Lazy loading of external data (data import)

Summary

Page 20: Produce and consume_linked_data_with_drupal

• The DERI approach does not use semantic repositories as a backend for storing RDF; we use OWLIM

• Things that might be relevant for us:
  • The mapping approach
  • Lazy loading of data from external sources

Relation to OC work