Comparative Survey-Java APIs For RDF

2009

Comparative survey ‐ Java APIs for RDF

Anisoara Sava, [email protected], OC2

Marcela Daniela Mihai, [email protected], OC2


Table of Contents

1 Introduction

2 Jena

2.1 The Model

2.2 Creating and serializing an RDF model

2.2.1 Adding more complex structures

2.3 Parsing and querying an RDF document

2.4 In memory versus persistent model storage

2.5 Jena query mechanism evaluation

3 SESAME

3.1 Introduction

3.2 Licence

3.3 Architecture

3.4 The Repository API

3.5 Transactions

3.6 Query Language

4 Jena RDF storage

5 Sesame RDF storage

6 Conclusions

Bibliography


1 Introduction

The Semantic Web is essentially a metadata management system. We may use metadata to describe everything: entities have sets of properties and relationships to various other classes. Languages such as RDF, RDFS and DAML provide markup tags that can be used to express (or assert) these relationships.

At the moment there are many tools for creating and manipulating large-scale ontologies. These tools fall into two groups: turnkey products and API/library implementations. Turnkey solutions, such as OntoMat and Protégé, can be used immediately to author metadata visually. However, we may want general-purpose APIs and libraries that we can use to build our own applications. Jena and Sesame are such APIs for Java.

2 Jena

Java programmers who want to develop semantic web applications have a growing range of tools and libraries to choose from. One such tool, the Jena platform, is an open source API and toolkit for processing Resource Description Framework (RDF), Web Ontology Language (OWL) and other semantic web data.

Jena is available on SourceForge (http://sourceforge.net/projects/jena) or at http://www.hpl.hp.com/semweb/jena.htm. In addition, there is a Jena developers' discussion forum at http://groups.yahoo.com/group/jena-dev/ .

The Jena platform has been developed by Hewlett-Packard's Semantic Web team. In fact, Brian McBride, one of the creators of Jena, is the co-chair of the RDF Working Group. Jena is derived from the SiRPAC API. It can parse, create and search RDF models.

The Jena Framework includes:

an RDF API

reading and writing RDF in RDF/XML, N3 and N-Triples (the RDF parser, ARP (Another RDF Parser), is also accessible as a standalone product)

an OWL API

in-memory and persistent storage

a SPARQL query engine

Figure 1 Jena API implementation architecture

The implementation has been designed to permit the easy integration of alternative processing modules such as parsers, serializers, stores and query processors.

The most important Jena packages are:

jena.model: key package for the application developer; it contains interfaces for model, resource…

jena.mem: contains an implementation of the Jena API which stores all model state in main memory

jena.common: contains implementation classes

The API itself consists of a collection of Java interfaces for accessing and manipulating RDF statements:

Model: a set of statements

Statement: a triple of {R, P, O}

Resource: subject, URI

Property: "item" of resource

Object: may be a resource or a literal

Literal: non-nested "object"

Container: special resource, collection of things

Figure 2 Jena API structure

We present below an example of RDF schema, where Jena's concept of interfaces is highlighted:

Figure 3 Jena interfaces - example

2.1 The Model

The Jena API can be used to create and manipulate RDF graphs. In Jena a graph is called a model and is represented by the Model interface.

The sample code for creating a model is very simple:

    // some definitions
    static String tutorialURI = "http://hostname/rdf/tutorial/";
    static String author = "Sava Anisoara";
    static String title = "Comparative survey of Java APIs for RDF";
    static String date = "30/11/2009";
    // create an empty graph
    Model model = new ModelMem();
    // create the resource
    Resource tutorial = model.createResource(tutorialURI);
    // add some properties
    tutorial.addProperty(DC.creator, author);
    tutorial.addProperty(DC.title, title);
    tutorial.addProperty(DC.date, date);

The ModelMem class creates an RDF model in memory.
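To make the idea of an in-memory model concrete, here is a minimal sketch in plain Java. It is not Jena's implementation: the class and method names (TinyModel, add, size) are hypothetical, and the only idea it borrows from the text is that a model is a set of {subject, predicate, object} statements held in main memory.

```java
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical stand-in for an in-memory RDF model: a set of triples. */
final class TinyModel {

    /** One statement: subject and predicate are URIs, object is a URI or a literal. */
    static final class Statement {
        final String subject;
        final String predicate;
        final String object;

        Statement(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Statement)) {
                return false;
            }
            Statement s = (Statement) other;
            return subject.equals(s.subject)
                    && predicate.equals(s.predicate)
                    && object.equals(s.object);
        }

        @Override
        public int hashCode() {
            return Objects.hash(subject, predicate, object);
        }

        @Override
        public String toString() {
            return "{" + subject + ", " + predicate + ", " + object + "}";
        }
    }

    // A set, so adding the same statement twice has no effect.
    private final Set<Statement> statements = new LinkedHashSet<>();

    /** Adds a statement and returns this model so that calls can be chained. */
    TinyModel add(String subject, String predicate, String object) {
        statements.add(new Statement(subject, predicate, object));
        return this;
    }

    int size() {
        return statements.size();
    }

    Set<Statement> statements() {
        return statements;
    }

    public static void main(String[] args) {
        String dc = "http://purl.org/dc/elements/1.0/";
        String tutorial = "http://hostname/rdf/tutorial/";
        TinyModel model = new TinyModel()
                .add(tutorial, dc + "creator", "Sava Anisoara")
                .add(tutorial, dc + "title", "Comparative survey of Java APIs for RDF")
                .add(tutorial, dc + "creator", "Sava Anisoara"); // duplicate, ignored
        for (Statement s : model.statements()) {
            System.out.println(s);
        }
    }
}
```

Everything in such a model is lost when the program exits, which is the motivation for the persistent variant described next.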

ModelRDB is another model implementation, used to manipulate RDF persisted within a relational database such as MySQL or Oracle. RDF data persisted with ModelRDB can be accessed later, because this model opens and maintains a connection to a relational database. With this model we can also specify how the RDF model is stored within the relational database: as a flat table of statements, as a hash, or through stored procedures.

2.2 Creating and serializing an RDF model

We can create an RDF model very simply: we create resources, add a couple of properties to them and then serialize the model (as shown before).

However, there is a problem with this approach, of creating the Property and Resource objects directly in the application that builds the model: we have to duplicate this functionality across all applications that want to use this vocabulary.

A possible solution is to build a vocabulary object, using a Java wrapper class.

Some examples of Jena's wrapper classes are Dublin Core (DC) RDF, VCARD RDF and so on. If we use a wrapper class for the properties and resources of our vocabulary, we define them in one place, an approach that simplifies both implementation and maintenance.

We will now create a vocabulary class for PostCon using the existing Jena’s vocabulary wrapper classes as a template. The PostCon wrapper class contains a set of static strings holding property or resource labels and a set of associated RDF properties (see below).

The PostCon vocabulary wrapper class


    package com.burningbird.postcon.vocabulary;

    import com.hp.hpl.mesa.rdf.jena.common.ErrorHelper;
    import com.hp.hpl.mesa.rdf.jena.common.PropertyImpl;
    import com.hp.hpl.mesa.rdf.jena.common.ResourceImpl;
    import com.hp.hpl.mesa.rdf.jena.model.Model;
    import com.hp.hpl.mesa.rdf.jena.model.Property;
    import com.hp.hpl.mesa.rdf.jena.model.Resource;
    import com.hp.hpl.mesa.rdf.jena.model.RDFException;

    public class POSTCON extends Object {

        // URI for vocabulary elements
        protected static final String uri =
            "http://burningbird.net/postcon/elements/1.0/";

        // Return URI for vocabulary elements
        public static String getURI( ) {
            return uri;
        }

        // Define the property labels and objects
        static final String nbio = "bio";
        public static Property bio = null;
        static final String nrelevancy = "relevancy";
        public static Property relevancy = null;
        static final String npresentation = "presentation";
        public static Resource presentation = null;
        static final String nhistory = "history";
        public static Property history = null;
        static final String nmovementtype = "movementType";
        public static Property movementtype = null;
        static final String nreason = "reason";
        public static Property reason = null;
        static final String nstatus = "currentStatus";
        public static Property status = null;
        static final String nrelated = "related";
        public static Property related = null;
        static final String ntype = "type";
        public static Property type = null;
        static final String nrequires = "requires";
        public static Property requires = null;

        // Instantiate the properties and the resource
        static {
            try {
                // Instantiate the properties
                bio = new PropertyImpl(uri, nbio);
                relevancy = new PropertyImpl(uri, nrelevancy);
                presentation = new PropertyImpl(uri, npresentation);
                history = new PropertyImpl(uri, nhistory);
                related = new PropertyImpl(uri, nrelated);
                type = new PropertyImpl(uri, ntype);
                requires = new PropertyImpl(uri, nrequires);
                movementtype = new PropertyImpl(uri, nmovementtype);
                reason = new PropertyImpl(uri, nreason);
                status = new PropertyImpl(uri, nstatus);
            } catch (RDFException e) {
                ErrorHelper.logInternalError("POSTCON", 1, e);
            }
        }
    }

At this point the PostCon vocabulary is ready to use; we only need to import the class in our applications.

import com.burningbird.postcon.vocabulary.POSTCON;

We present below an example of using the PostCon wrapper class to add properties to a resource:

    import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.vocabulary.*;
    import com.burningbird.postcon.vocabulary.POSTCON;
    import java.io.FileOutputStream;
    import java.io.PrintWriter;

    public class pracRDF extends Object {

        public static void main (String args[]) {
            // Resource names
            String sResource = "http://burningbird.net/articles/monsters1.htm";
            String sRelResource1 = "http://burningbird.net/articles/monsters2.htm";
            String sRelResource2 = "http://burningbird.net/articles/monsters3.htm";
            try {
                // Create an empty graph
                Model model = new ModelMem( );

                // Create the resource
                // and add the properties cascading style
                Resource article = model.createResource(sResource)
                    .addProperty(POSTCON.related, model.createResource(sRelResource1))
                    .addProperty(POSTCON.related, model.createResource(sRelResource2));

                // Print RDF/XML of model to system output
                model.write(new PrintWriter(System.out));
            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }

We can easily observe that using the wrapper class simplified the code considerably.


The generated RDF/XML from the serialized PostCon submodel looks like this:

    <rdf:RDF
      xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
      xmlns:NS0='http://burningbird.net/postcon/elements/1.0/'
     >
      <rdf:Description rdf:about='http://burningbird.net/articles/monsters1.htm'>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters2.htm'/>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters3.htm'/>
      </rdf:Description>
    </rdf:RDF>

2.2.1 Adding more complex structures

However, many RDF models use more complex structures, including nested resources that follow the RDF node-edge-node pattern. We will now look at how Jena handles these more complex RDF model structures.

If we have the following RDF graph,

    <rdf:RDF
      xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
      xmlns:NS0='http://burningbird.net/postcon/elements/1.0/'
      xmlns:dc='http://purl.org/dc/elements/1.0/'
     >
      <rdf:Description rdf:nodeID='A0'>
        <dc:creator>Shelley Powers</dc:creator>
        <dc:publisher>Burningbird</dc:publisher>
        <dc:title xml:lang='en'>Tale of Two Monsters: Legends</dc:title>
      </rdf:Description>
      <rdf:Description rdf:about='http://burningbird.net/articles/monsters1.htm'>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters2.htm'/>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters3.htm'/>
        <NS0:bio rdf:nodeID='A0'/>
      </rdf:Description>
    </rdf:RDF>

it is very easy to generate a graph of this shape with Jena (the code below uses its own creator and publisher values):

    import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.vocabulary.*;
    import com.burningbird.postcon.vocabulary.POSTCON;
    import java.io.FileOutputStream;
    import java.io.PrintWriter;

    public class pracRDFSecond extends Object {

        public static void main (String args[]) {
            // Resource names
            String sResource = "http://burningbird.net/articles/monsters1.htm";
            String sRelResource1 = "http://burningbird.net/articles/monsters2.htm";
            String sRelResource2 = "http://burningbird.net/articles/monsters3.htm";
            String sType = "http://burningbird.net/postcon/elements/1.0/Resource";
            try {
                // Create an empty graph
                Model model = new ModelMem( );

                // Create the resource
                // and add the properties cascading style
                Resource article = model.createResource(sResource)
                    .addProperty(POSTCON.related, model.createResource(sRelResource1))
                    .addProperty(POSTCON.related, model.createResource(sRelResource2));

                // Create the bio bnode (blank node) resource
                // and add properties
                Resource bio = model.createResource( )
                    .addProperty(DC.creator, "Mihai Marcela")
                    .addProperty(DC.publisher, "Infoiasi")
                    .addProperty(DC.title,
                        model.createLiteral("Tale of Two Monsters: Legends", "en"));

                // Attach to main resource
                article.addProperty(POSTCON.bio, bio);

                // Print RDF/XML of model to system output
                model.write(new PrintWriter(System.out));
            } catch (Exception e) {
                // the try block needs a handler, as in the previous example
                System.out.println("Failed: " + e);
            }
        }
    }

The bio property is a resource that does not have a specific URI (blank node). Anyway, it is not a literal so we create a new resource for bio, and add several properties to it. As you have seen, these properties are defined in the DC vocabulary, so we use the DC wrapper class to create them.

Jena offers support for creating typed nodes (rdf:type) and containers.

An RDF container is a grouping of related items. It represents a collection of things:

Bag – unordered collection

Alt – unordered collection except first element

Seq – ordered collection
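In the RDF data model a container is itself described with ordinary statements: one rdf:type statement naming the container kind, plus one membership statement per item using the numbered properties rdf:_1, rdf:_2 and so on. The following sketch, in plain Java without Jena, only illustrates that expansion; the class name, the method name and the blank node label _:c1 are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: expands a container into the triples that describe it. */
final class ContainerSketch {

    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /**
     * Returns one "subject predicate object" line per triple.
     * kind is "Bag", "Alt" or "Seq"; members keep their list order.
     */
    static List<String> containerTriples(String node, String kind, List<String> members) {
        List<String> triples = new ArrayList<>();
        // The container is typed as rdf:Bag, rdf:Alt or rdf:Seq.
        triples.add(node + " " + RDF + "type " + RDF + kind);
        // Each member is attached with rdf:_1, rdf:_2, ...
        for (int i = 0; i < members.size(); i++) {
            triples.add(node + " " + RDF + "_" + (i + 1) + " " + members.get(i));
        }
        return triples;
    }

    public static void main(String[] args) {
        List<String> articles = Arrays.asList(
                "http://burningbird.net/articles/monsters1.htm",
                "http://burningbird.net/articles/monsters2.htm",
                "http://burningbird.net/articles/monsters3.htm");
        for (String triple : containerTriples("_:c1", "Seq", articles)) {
            System.out.println(triple);
        }
    }
}
```

The three container kinds differ only in the rdf:type statement; whether the numbering carries meaning (Seq) or not (Bag) is a matter of interpretation by the application.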


2.3 Parsing and querying an RDF document

After we store the data in a model, our next concern is how we can query it.

The data stored in an RDF model can be accessed directly using specific API functions or via RDQL, an RDF query language. Querying with an SQL-like syntax is a very effective way of pulling data from an RDF model (stored in memory or in a relational database).

Jena’s RDQL is implemented as an object called Query. This object is passed to a query engine (QueryEngine) and the results are stored in a query result (QueryResult).

As soon as data is retrieved from the RDF/XML we can iterate through it using any number of iterators (NodeIterator – for general RDF nodes, ResIterator, StmtIterator).

Examples of RDF iterating:

Go through statements

A statement is often called a triple, because of its three parts. An RDF graph is represented as a set of statements. The Statement interface provides accessor methods for the subject, predicate and object of a statement.

    // list the statements in a graph
    StmtIterator iter = model.listStatements();
    // print out the subject, predicate and object of each statement
    while (iter.hasNext()) {
        Statement stmt = iter.next();               // get the next statement
        Resource subject = stmt.getSubject();       // get the subject
        Property predicate = stmt.getPredicate();   // get the predicate
        RDFNode object = stmt.getObject();          // get the object
        System.out.print(subject.toString());
        System.out.print(" " + predicate.toString() + " ");
        if (object instanceof Resource) {
            System.out.print(object.toString());
        } else {
            // object is a literal
            System.out.print(" \"" + object.toString() + "\"");
        }
    }

Instead of listStatements we could also dump out the subjects (listSubjects, ResIterator) or, more generally, the objects (listObjects, NodeIterator).

Also, instead of listing all statements or all objects, we can fine‐tune the code to list only subjects, statements, or objects matching specific properties, using the property implementations created within the wrapper class.
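The idea behind such selective listing is triple-pattern matching: fix some of the three components and leave the others open. The sketch below shows this in plain Java; it is not Jena code, the names PatternMatchSketch and match are hypothetical, and null is used here as the "match anything" marker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative triple-pattern matching over an in-memory statement list. */
final class PatternMatchSketch {

    /** Returns the triples matching the pattern; a null component matches anything. */
    static List<String[]> match(List<String[]> triples, String s, String p, String o) {
        List<String[]> result = new ArrayList<>();
        for (String[] t : triples) {
            boolean matches = (s == null || s.equals(t[0]))
                    && (p == null || p.equals(t[1]))
                    && (o == null || o.equals(t[2]));
            if (matches) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String postcon = "http://burningbird.net/postcon/elements/1.0/";
        String articles = "http://burningbird.net/articles/";
        List<String[]> triples = Arrays.asList(
                new String[] {articles + "monsters1.htm", postcon + "related", articles + "monsters2.htm"},
                new String[] {articles + "monsters1.htm", postcon + "related", articles + "monsters3.htm"},
                new String[] {articles + "monsters1.htm", postcon + "bio", "_:A0"});

        // All statements whose predicate is postcon:related, any subject and object.
        for (String[] t : match(triples, null, postcon + "related", null)) {
            System.out.println(t[0] + " -> " + t[2]);
        }
    }
}
```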

Navigating a model

    Property email = model.createProperty(tutorialURI, "emailAddress");
    Resource tutorial = model.getResource(tutorialURI);
    Resource author = tutorial.getProperty(DC.creator).getResource();
    // list all of the author's email addresses
    StmtIterator iter = author.listProperties(email);
    while (iter.hasNext()) {
        System.out.println(" " + iter.next().getObject().toString());
    }

Querying a model

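    // subjects that have the given dc:date value
    ResIterator iter = model.listSubjectsWithProperty(DC.date, date);
    while (iter.hasNext()) {
        System.out.println(iter.next().getProperty(DC.title).toString());
    }
    // objects of dc:creator; they are treated as resources here, and 'name'
    // is assumed to be a Property defined elsewhere in the application
    NodeIterator iter2 = model.listObjectsOfProperty(DC.creator);
    while (iter2.hasNext()) {
        System.out.println(((Resource) iter2.next()).getProperty(name).toString());
    }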


2.4 In memory versus persistent model storage

As mentioned before, Jena can also persist data to relational database storage. The supported databases are MySQL, PostgreSQL, Interbase and Oracle. Within each database system, Jena also supports different storage layouts:

Generic: all statements are stored in a single table, and resources and literals are indexed using integer identifiers generated by the database

GenericProc: similar to generic, but data access is through stored procedures

MMGeneric: similar to generic but can store multiple models

Hash: similar to generic but uses MD5 hashes to generate the identifiers

MMHash: similar to hash but can store multiple models
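The difference between the generic and hash layouts comes down to how an identifier is obtained for a resource or literal. The sketch below contrasts the two ideas in plain Java. It is an illustration under assumptions: the class and method names are hypothetical, an in-memory counter stands in for database-generated integers, and the exact way Jena derives an identifier from an MD5 digest is not shown in this survey.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Illustrative identifier schemes for resources and literals. */
final class IdentifierSketch {

    /** Generic-style: integers handed out in order (a counter stands in for the database). */
    static final class SequentialIds {
        private final Map<String, Integer> ids = new HashMap<>();

        int idFor(String value) {
            Integer id = ids.get(value);
            if (id == null) {
                id = ids.size() + 1;
                ids.put(value, id);
            }
            return id;
        }
    }

    /** Hash-style: the identifier is computed from the value itself, with no lookup. */
    static String md5Id(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            // 16 bytes rendered as 32 lowercase hex digits, zero-padded
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        String uri = "http://burningbird.net/articles/monsters1.htm";
        SequentialIds generic = new SequentialIds();
        System.out.println("generic id: " + generic.idFor(uri));
        System.out.println("hash id:    " + md5Id(uri));
    }
}
```

A computed identifier needs no round trip to look up or allocate a number, which is one plausible reason for the hash layout being described as slightly more efficient; the integer scheme keeps identifiers short and easy to read when the tables are inspected directly.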

For storing a model in a database we must first create the structure to store the data (the tables must be created in an already existing database, which has been formatted).

We present below an example of persisting a model to a database. We create the JDBC connection ourselves. The model is stored in a MySQL database using the MMGeneric layout (we are not using the slightly more efficient hash layout, MMHash, because the generic layout is the better choice if we also access the data directly through JDBC rather than through Jena).

    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.rdb.ModelRDB;
    import com.hp.hpl.mesa.rdf.jena.rdb.DBConnection;
    import java.sql.*;

    public class pracRDFThree extends Object {

        public static void main (String args[]) {
            // Pass one RDF document, connection string, user name and password
            String sUri = args[0];
            String sConn = args[1];
            String sUser = args[2];
            String sPass = args[3];
            try {
                // Load driver class
                Class.forName("com.mysql.jdbc.Driver").newInstance( );

                // Establish connection with the supplied credentials
                Connection con = DriverManager.getConnection(sConn, sUser, sPass);
                DBConnection dbcon = new DBConnection(con);

                // Format database
                ModelRDB.create(dbcon, "MMGeneric", "Mysql");

                // Create and read the model
                ModelRDB model1 = ModelRDB.createModel(dbcon, "modelOne");
                model1.read(sUri);
            } catch (Exception e) {
                // the try block needs a handler, as in the other examples
                System.out.println("Failed: " + e);
            }
        }
    }

Once the model is persisted any number of applications can access it.

We also show how to access the model stored in a MySQL database and dump all its objects:

    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.rdb.ModelRDB;
    import com.hp.hpl.mesa.rdf.jena.rdb.DBConnection;
    import java.sql.*;

    public class pracRDFFour extends Object {

        public static void main (String args[]) {
            String sConn = args[0];
            String sUser = args[1];
            String sPass = args[2];
            try {
                // load driver class
                Class.forName("com.mysql.jdbc.Driver").newInstance( );

                // Establish connection - replace with your own conn info
                Connection con = DriverManager.getConnection(sConn, sUser, sPass);
                DBConnection dbcon = new DBConnection(con);

                // Open the existing model
                ModelRDB model1 = ModelRDB.open(dbcon, "modelOne");

                // Print out objects in the model using toString
                NodeIterator iter = model1.listObjects( );
                while (iter.hasNext( )) {
                    System.out.println(" " + iter.next( ).toString( ));
                }
            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }

In addition to accessing the data through the Jena API, you can also access it directly using whatever database connectivity you prefer.

As you can see the Jena API for managing RDF databases is very straightforward and easy to use.

2.5 Jena query mechanism evaluation

Some tests were carried out to explore the benefits of semantic web technologies (RDF, RDF Schema) over existing ones such as XML and XML Schema in the context of a scientific application. It was suggested to replace the Chemistry Markup Language (CML, a well-established XML format for exchanging information about molecules, reactions and experiments) with RDF.

The specific technical goal of this study was to evaluate whether Jena would be suitable for storing large numbers of molecules, and whether the RDQL query functionality would support searches for substructures on the scale that would be required in a realistic application.

The ability of RDF to support graph‐based query was identified as a potential benefit in this application, allowing chemists to search a repository of molecules for molecular sub‐structures.

But, queries for simple molecular substructures (such as N‐C‐N) within modest repositories (100 molecules, 26118 statements) took a prohibitively large amount of time to complete (>24 hours). So, unless efficiency of graph query engines is improved, RDF technologies remain inadequate for data‐intensive applications. (See more details on: www.w3c.rl.ac.uk/SWAD/papers/RDFMolecules_final.doc ).
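To give an intuition for why such graph queries can become expensive, the sketch below searches for N-C-N chains by naively scanning the whole statement list for every part of the pattern. It is plain Java written for illustration only: it does not reproduce the queries or data of the cited study, nor the way Jena's RDQL engine evaluates a query, and the predicate names (element, bondedTo) and atom identifiers are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative naive evaluation of an N-C-N substructure pattern over triples. */
final class SubstructureSketch {

    static final String ELEMENT = "element";
    static final String BOND = "bondedTo";

    /** Counts chains a-b-c where a is N, b is C, c is N and a differs from c. */
    static int countNCN(List<String[]> triples) {
        int matches = 0;
        for (String[] first : triples) {            // pattern: ?a bondedTo ?b
            if (!BOND.equals(first[1])) {
                continue;
            }
            for (String[] second : triples) {       // pattern: ?b bondedTo ?c
                if (!BOND.equals(second[1]) || !first[2].equals(second[0])) {
                    continue;
                }
                if (first[0].equals(second[2])) {
                    continue;                       // a and c must be different atoms
                }
                // Each element check is yet another full scan of the statements.
                if (hasElement(triples, first[0], "N")
                        && hasElement(triples, first[2], "C")
                        && hasElement(triples, second[2], "N")) {
                    matches++;
                }
            }
        }
        return matches;
    }

    static boolean hasElement(List<String[]> triples, String atom, String symbol) {
        for (String[] t : triples) {
            if (t[0].equals(atom) && ELEMENT.equals(t[1]) && t[2].equals(symbol)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A small fragment: n1 - c1 - n2, with an oxygen also attached to c1.
        List<String[]> triples = Arrays.asList(
                new String[] {"n1", ELEMENT, "N"},
                new String[] {"c1", ELEMENT, "C"},
                new String[] {"n2", ELEMENT, "N"},
                new String[] {"o1", ELEMENT, "O"},
                new String[] {"n1", BOND, "c1"},
                new String[] {"c1", BOND, "n2"},
                new String[] {"c1", BOND, "o1"});
        System.out.println("N-C-N matches: " + countNCN(triples));
    }
}
```

With two nested scans and further scans for the element checks, the work grows much faster than the number of statements; without indexes or join ordering, longer patterns and larger repositories quickly become impractical.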

Undoubtedly, RDF offers powerful and flexible alternatives for modelling complex data structures, and its application in this area deserves further study.


3 SESAME

3.1 Introduction

Sesame is an open source framework for storage, inferencing and querying of RDF data. The community site for information about development with Sesame is http://openrdf.org. It was initially developed by the Dutch company Aduna as a research prototype for the EU research project On-To-Knowledge. At present, the Sesame framework is maintained and developed by Aduna in collaboration with the NLnet Foundation and a number of volunteer developers who contribute ideas, bug reports and fixes.

Sesame includes an excellent administration interface in its distribution, is easy to install, and is considered one of the leading graph stores, with excellent performance.

The most important characteristics of Sesame 2.x are:

completely targeted at Java 5; all APIs use Java 5 features such as typed collections and iterators.

support for the SPARQL Query Language.

support for context/provenance, making it possible to keep track of individual RDF data units (like files, for instance).

proper transaction/rollback support.

3.2 Licence

While version 1.x of Sesame was available under the terms of the GNU Lesser General Public License (LGPL), version 2.1, version 2.x is available under a BSD-style license. This version also includes software developed by the Apache Software Foundation.


3.3 Architecture

Sesame has an architecture that allows persistent storage of RDF data and schema information and subsequent querying of that information.

To keep Sesame DBMS-independent, all DBMS-specific code is concentrated in a single architectural layer: the Storage And Inference Layer (SAIL).

SAIL is an application programming interface (API) that offers RDF-specific methods to its clients and translates these methods into calls to its specific DBMS. An important advantage of introducing such a separate layer is that it makes it possible to implement Sesame on top of a wide variety of repositories without changing any of Sesame's other components.
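The layering idea can be sketched in a few lines of plain Java. This is not the SAIL API: the interface TripleStorage, its methods and the export function are hypothetical names chosen for illustration. The point is only that a client module is written against the interface, so the backing store can be swapped without touching the client.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative storage abstraction in the spirit of a storage layer. */
final class StorageLayerSketch {

    /** Hypothetical storage interface; only implementations know the backing store. */
    interface TripleStorage {
        void add(String subject, String predicate, String object);

        Iterable<String[]> all();
    }

    /** One possible backend: a list in main memory. A DBMS-backed one could replace it. */
    static final class ListStorage implements TripleStorage {
        private final List<String[]> triples = new ArrayList<>();

        @Override
        public void add(String subject, String predicate, String object) {
            triples.add(new String[] {subject, predicate, object});
        }

        @Override
        public Iterable<String[]> all() {
            return triples;
        }
    }

    /** A client module (an export function) that depends only on the interface. */
    static String export(TripleStorage storage) {
        StringBuilder out = new StringBuilder();
        for (String[] t : storage.all()) {
            out.append('<').append(t[0]).append("> <")
               .append(t[1]).append("> <")
               .append(t[2]).append("> .\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        TripleStorage storage = new ListStorage();
        storage.add("http://burningbird.net/articles/monsters1.htm",
                "http://burningbird.net/postcon/elements/1.0/related",
                "http://burningbird.net/articles/monsters2.htm");
        System.out.print(export(storage));
    }
}
```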


The functional modules in Sesame are clients of the SAIL API. Currently, there are three such modules: The RQL query engine, the RDF admin module and the RDF export module.

Direct access to Sesame's functional modules is provided through the Access APIs. These can also be used to access the next component of Sesame's architecture, the Sesame Server, which provides HTTP- and RMI-based access to Sesame's APIs.

The Model API is represented by the Model interface, which holds a collection of indexed statements and implements Set<Statement>. Unlike the MemoryStore, Model preserves the statement ordering, but it does not provide concurrent modification, transaction, or query support. The Model is a welcome addition for programs that simply need to read and write RDF files.

To get access to the RDF parsers, writers and basic model types such as URI, Literal, Statement and Model from Sesame, we only have to include sesame-client.jar.

The following code demonstrates how to work with RDF files: it reads an RDF file, validates it and then writes it back in an organized format:

    File file = new File("somefile.rdf");
    StatementCollector collector = new StatementCollector();
    InputStream in = new FileInputStream(file);
    try {
        RDFParser parser = new RDFXMLParser();
        parser.setRDFHandler(collector);
        parser.parse(in, file.toURI().toString());
    } finally {
        in.close();
    }
    Model model = collector.getModel();

    // perform some validation: every superclass must be declared as a class
    for (Value obj : model.filter(null, RDFS.SUBCLASSOF, null).objects()) {
        if (!model.contains((Resource) obj, RDF.TYPE, RDFS.CLASS))
            throw new Exception("missing super class");
    }

    // sort the statements and write them back out
    model = new ModelOrganizer(model).organize();
    OutputStream out = new FileOutputStream(file);
    try {
        RDFWriter writer = new RDFXMLPrettyWriter(out);
        writer.startRDF();
        for (String prefix : model.getNamespaces().keySet()) {
            writer.handleNamespace(prefix, model.getNamespace(prefix));
        }
        for (Statement st : model) {
            writer.handleStatement(st);
        }
        writer.endRDF();
    } finally {
        // close the output stream (the input stream was already closed above)
        out.close();
    }

3.4 The Repository API

The Repository API is the central access point for Sesame repositories. It can be used to query and update the contents of both local and remote repositories. The Repository API handles all the details of client‐server communication, allowing you to handle remote repositories as easily as local ones.

The main interfaces of the Repository API can be found in the package org.openrdf.sesame.repository. The implementations of these interfaces for local and remote repositories can be found in subpackages of this package.

3.5 Transactions

Sesame 3.0 introduces isolation levels to the API. In 2.x each store (MemoryStore, NativeStore, and RdbmsStore) used its own fixed isolation level. In Sesame 3.0, each connection can choose its own isolation level. These are similar to the ANSI SQL isolation levels, but Sesame has created its own RDF-specific definitions and provides a few more isolation levels that are relevant to distributed and optimistic stores.

Unlike the ANSI SQL isolation levels, Sesame distinguishes between snapshot and serializable isolation levels. Furthermore, Sesame's definition of snapshot isolation allows independent RDF stores to use eventual consistency to manage the changes to the store. This makes clusters of RDF stores easier to implement for connections that allow eventual consistency, but still require a consistent non‐changing view of the store. Some of Sesame's lower levels of isolation explicitly permit the use of HTTP caching (with some restrictions). This allows connections that only need weak consistency to operate at a significantly enhanced speed, by not having to check for new modifications every time.

The following can be considered future work for this framework:

Transaction rollback support. While the SAIL API has support for transactions, it currently has no transaction rollback feature. Transaction rollbacks, especially in the case of uploading information, are crucial if we wish to guarantee database consistency.

Versioning support. The current version of Sesame has no support for versioning. The basic type of versioning will enable more elaborate versioning schemes.

DAML+OIL support.

3.6 Query Language

Sesame currently supports two query languages: SeRQL and SPARQL.

SeRQL ("Sesame RDF Query Language", pronounced "circle") is a new RDF/RDFS query language that is currently being developed by Aduna as part of Sesame. It combines the best features of other (query) languages (RQL, RDQL, N-Triples, N3) and adds some of its own.

Some of SeRQL's most important features are:

Graph transformation.

RDF Schema support.

XML Schema datatype support.

Expressive path expression syntax.

Optional path matching.

4 Jena RDF storage

The Jena architecture provides an abstract RDF model to manage an internal graph that stores the RDF data. Applications interact with an abstract Model, which translates higher-level operations into low-level operations on triples stored in an RDF graph.

The Jena database subsystem implements persistence for RDF graphs using a relational database through a JDBC connection.

Jena reduces response time by using a denormalized relational schema: the current version of Jena trades off space for time. Both resource URIs and simple literal values are stored directly in the statement table, which makes it possible to perform many find operations without a join. Although the size of the statement table is a problem, Jena2 provides several options to reduce it, such as compressing namespaces (by defining a prefix and using this prefix as a reference to the namespace) and storing long values only once.
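The namespace compression option can be illustrated with a small prefix table. The sketch below is plain Java under assumptions: it is not Jena2's storage code, and the class and method names (NamespaceCompressionSketch, register, compress, expand) are hypothetical. It shows how a long namespace is stored once and each URI in the statement table is shortened to a prefixed form.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative namespace compression with a prefix table. */
final class NamespaceCompressionSketch {

    // namespace URI -> short prefix; each namespace is stored only once
    private final Map<String, String> prefixByNamespace = new LinkedHashMap<>();

    void register(String prefix, String namespace) {
        prefixByNamespace.put(namespace, prefix);
    }

    /** Replaces a known namespace with "prefix:"; unknown URIs are stored unchanged. */
    String compress(String uri) {
        for (Map.Entry<String, String> e : prefixByNamespace.entrySet()) {
            if (uri.startsWith(e.getKey())) {
                return e.getValue() + ":" + uri.substring(e.getKey().length());
            }
        }
        return uri;
    }

    /** Reverses compress(); this sketch assumes prefixes do not clash with URI schemes. */
    String expand(String stored) {
        for (Map.Entry<String, String> e : prefixByNamespace.entrySet()) {
            String marker = e.getValue() + ":";
            if (stored.startsWith(marker)) {
                return e.getKey() + stored.substring(marker.length());
            }
        }
        return stored;
    }

    public static void main(String[] args) {
        NamespaceCompressionSketch table = new NamespaceCompressionSketch();
        table.register("dc", "http://purl.org/dc/elements/1.0/");
        String stored = table.compress("http://purl.org/dc/elements/1.0/creator");
        System.out.println(stored);               // prints dc:creator
        System.out.println(table.expand(stored)); // prints the full URI again
    }
}
```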


5 Sesame RDF storage

Sesame is an open source RDF database with support for RDF Schema inferencing and querying. The progress in the development of Sesame has revealed some characteristics of RDF stores.

The performance of Sesame on top of an object-relational DBMS has been examined in several studies [Jeen Broekstra, Arjohn Kampman, and Frank van Harmelen. Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. http://wwwis.win.tue.nl/~jbroekst/papers/ISWC02.pdf ]. All reach the same conclusion: performance is very low if the database system creates a table whenever a new class or property is added.

The study carried out with PostgreSQL in [Jeen Broekstra, Arjohn Kampman, and Frank van Harmelen. Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. http://wwwis.win.tue.nl/~jbroekst/papers/ISWC02.pdf] uses a different approach (similar to Jena) in which all RDF statements are inserted in a single table with three columns: "Subject, Predicate, Object". In scenarios where the schema changes often, this approach is better than the previous one.

It has been shown that the performance of inference over large amounts of data can be improved by using an object-oriented database (for both Jena and Sesame), because it implements "on demand materialization" and concepts such as "class extent". One of the most important features of an object-oriented database is that objects in the database are accessed transparently, so that interacting with persistent objects is no different from interacting with in-memory objects.

6 Conclusions

Jena is the best-known RDF triple store. It provides a programmatic environment for RDF, RDFS, OWL and SPARQL and includes a rule-based inference engine.

The documentation is very good and it also comes with some extra tools developed by third parties.

A plus for Jena is that it works very well on Linux, FreeBSD and Windows.

Because a lot of programmers use it, the support for developing with Jena (integration in existing IDEs) is very good.

A minus for Jena is that it does not come with a web framework or a web-based SPARQL interface. Sesame, on the other hand, has both and also comes with a more modern architecture, so the Semantic Web community has recently tended to favour Sesame.

Both Jena and Sesame take advantage of conventional database systems, although Sesame supports only MySQL and PostgreSQL.

At the moment, Sesame seems to be the most modern approach. It comes with a plugin architecture that makes it very modular (in its second version it offers interfaces to the Spring Framework).

A major advantage of this framework is that it offers fairly good documentation for users and developers. Because it is used by many programmers, it is not difficult to find code examples and opinions about particular aspects of the framework. And because it is implemented in Java, it is portable across many platforms.

As we said before, the most popular aspect of Sesame is that it can be deployed on Tomcat, so that one can upload an ontology and test queries in the browser.

Another important feature of the Sesame architecture is its abstraction from the details of any particular repository used for the actual storage. This makes it possible to port Sesame to a large variety of different repositories, including relational databases, RDF triple stores, and even remote storage services on the Web.


Like Jena, Sesame has a lively community and a lot of documentation, which makes us believe that it will continue to be developed over the years and that interest in it will only increase.

Currently, Jena and Sesame are the two most popular RDF store implementations. Because there is no RDF API specification accepted by the Java community, programmers use either the Jena API or the Sesame API to publish, query, and reason over RDF triples. The resulting RDF application source code is therefore tightly coupled to either the Jena or the Sesame API. Interoperability becomes a problem when a Jena-based application needs to access a Sesame backend, or a Sesame-based application needs to access a Jena backend. This problem is partly solved by the Sesame-Jena Adapter, which provides access to a Jena model through the Sesame SAIL API.
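The idea behind such an adapter can be sketched with the classic adapter pattern. The code below is only an illustration under stated assumptions: ModelLike and SailLike are hypothetical, heavily simplified stand-ins invented for this example, not the real Jena Model or Sesame SAIL interfaces, and the sketch says nothing about how the actual Sesame-Jena Adapter is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Jena-style model; NOT the real Jena API.
interface ModelLike {
    void addStatement(String subject, String predicate, String object);
    List<String[]> listStatements();
}

// Hypothetical stand-in for a SAIL-style storage layer; NOT the real Sesame SAIL API.
interface SailLike {
    void store(String subject, String predicate, String object);
    List<String[]> fetchAll();
}

// Simple in-memory model that plays the role of the backend being adapted.
final class InMemoryModel implements ModelLike {
    private final List<String[]> statements = new ArrayList<>();

    @Override
    public void addStatement(String subject, String predicate, String object) {
        statements.add(new String[] {subject, predicate, object});
    }

    @Override
    public List<String[]> listStatements() {
        // Return a copy so callers cannot modify the internal list.
        return new ArrayList<>(statements);
    }
}

// The adapter: exposes the SAIL-style interface and delegates every call to the model.
final class ModelToSailAdapter implements SailLike {
    private final ModelLike model;

    ModelToSailAdapter(ModelLike model) {
        this.model = model;
    }

    @Override
    public void store(String subject, String predicate, String object) {
        model.addStatement(subject, predicate, object);
    }

    @Override
    public List<String[]> fetchAll() {
        return model.listStatements();
    }
}

final class AdapterSketch {
    public static void main(String[] args) {
        ModelLike model = new InMemoryModel();
        SailLike sail = new ModelToSailAdapter(model);

        // Code written against the SAIL-style interface ends up writing to the model.
        sail.store("ex:alice", "ex:knows", "ex:bob");
        System.out.println(model.listStatements().size());
    }
}
```

With this arrangement, application code depends only on the SAIL-style interface, so the coupling to one API is confined to the adapter class.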

Both Jena and Sesame are distributed under a BSD license, which allows developers to modify and redistribute the packages and sources.

In our opinion, Jena is a very strong and mature tool, though a little slower than Sesame because it follows the SPARQL specifications to the letter. That is why developers increasingly prefer the Sesame API, which takes a more modern approach and offers somewhat better speed, but very low scalability.

In the end, it is the developer's choice which to sacrifice in an application: speed, strict adherence to the specifications, or scalability.

Another well-regarded semantic store is Mulgara, an open source, massively scalable, transaction-safe, purpose-built database for the storage and retrieval of RDF, written in Java.

Bibliography

An introduction to RDF and Jena RDF API http://jena.sourceforge.net/tutorial/RDF_API/


Joe Verzulli, Using the Jena API to process RDF http://www.xml.com/pub/a/2001/05/23/jena.html

Meet Jena, a semantic web platform for Java http://www.devx.com/semantic/Article/34968

RDF with Jena http://www.docstoc.com/docs/13613350/RDF-with-Jena

Shelley Powers, Practical RDF, O’Reilly

Brian McBride, Jena: Implementing the RDF Model and Syntax Specification http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-40/mcbride.pdf

Jesús Soto, Oscar Sanjuan, Luis Joyanes, Semantic Web Servers: A new approach to query on big datasets of metadata

http://sites.google.com/site/wjfang2/jenasesamemodel

Create Scalable Semantic Applications with Database-Backed RDF Stores http://www.devx.com/semantic/Article/35480/0/page/3

http://www.openrdf.org/

http://answers.oreilly.com/topic/447-how-to-use-the-sesame-java-api-to-power-a-web-or-client-server-application/
