2009
Comparative survey ‐ Java APIs for RDF
Anisoara Sava, [email protected], OC2
Marcela Daniela Mihai, [email protected], OC2
2 Comparative Survey – Java APIs for RDF, Anisoara Sava, Marcela Daniela Mihai
Table of Contents
1 Introduction
2 Jena
2.1 The Model
2.2 Creating and serializing an RDF model
2.2.1 Adding more complex structures
2.3 Parsing and querying an RDF document
2.4 In memory versus persistent model storage
2.5 Jena query mechanism evaluation
3 SESAME
3.1 Introduction
3.2 Licence
3.3 Architecture
3.4 The Repository API
3.5 Transactions
3.6 Query Language
4 Jena RDF storage
5 Sesame RDF storage
6 Conclusions
Bibliography
1 Introduction
The Semantic Web is essentially a metadata management system. We may use metadata to describe everything: entities have sets of properties and relationships to various other classes. Languages such as RDF, RDFS and DAML provide markup tags that can be used to express (or assert) these relationships.
There are currently many tools for creating and manipulating large-scale ontologies. They fall into two groups: turnkey products and API/library implementations. Turnkey solutions, such as OntoMat and Protégé, can be used immediately to author metadata visually. To build our own applications, however, we need general-purpose APIs and libraries. Jena and Sesame are two such APIs for Java.
2 Jena
Java programmers who want to develop semantic web applications have a growing range of tools and libraries to choose from. One such tool, the Jena platform, is an open source API and toolkit for processing Resource Description Framework (RDF), Web Ontology Language (OWL) and other semantic web data.
Jena is accessible at Source Forge (http://sourceforge.net/projects/jena) or at http://www.hpl.hp.com/semweb/jena.htm. In addition, there’s a Jena developers’ discussion forum at http://groups.yahoo.com/group/jena‐dev/ .
The Jena platform has been developed by Hewlett-Packard's Semantic Web team. In fact, the co-chair of the RDF Working Group is Brian McBride, one of the creators of Jena. Jena is derived from the SiRPAC API. It can parse, create and search RDF models.
The Jena Framework includes:
an RDF API
reading and writing RDF in RDF/XML, N3 and N-Triples (the RDF parser, ARP, "Another RDF Parser", is also accessible as a standalone product)
an OWL API
in-memory and persistent storage
a SPARQL query engine
Figure 1 Jena API implementation architecture
The implementation has been designed to permit the easy integration of alternative processing modules such as parsers, serializers, stores and query processors.
The most important Jena packages are:
jena.model: key package for the application developer; it contains interfaces for model, resource…
jena.mem: contains an implementation of the Jena API which stores all model state in main memory
jena.common: contains implementation classes
The API itself consists of a collection of Java interfaces for accessing and manipulating RDF statements:
Model: a set of statements
Statement: a triple of {R, P, O}
Resource: subject, URI
Property: "item" of resource
Object: may be a resource or a literal
Literal: non-nested "object"
Container: special resource, collection of things
Figure 2 Jena API structure
We present below an example of RDF schema, where Jena's concept of interfaces is highlighted:
Figure 3 Jena interfaces - example
2.1 The Model
The Jena API can be used to create and manipulate RDF graphs. In Jena a graph is called a model and is represented by the Model interface.
The sample code for creating a model is very simple:
    // some definitions
    static String tutorialURI = "http://hostname/rdf/tutorial/";
    static String author = "Sava Anisoara";
    static String title = "Comparative survey of Java APIs for RDF";
    static String date = "30/11/2009";
    // create an empty graph
    Model model = new ModelMem();
    // create the resource
    Resource tutorial = model.createResource(tutorialURI);
    // add some properties
    tutorial.addProperty(DC.creator, author);
    tutorial.addProperty(DC.title, title);
    tutorial.addProperty(DC.date, date);
The ModelMem class creates an RDF model in memory.
ModelRDB is another model implementation, used to manipulate RDF persisted in a relational database such as MySQL or Oracle. Because this model opens and maintains a connection to the database, RDF data persisted with ModelRDB can be accessed again later. With this model we can also specify how the RDF model is stored in the relational database: as a flat table of statements, with hashed identifiers, or through stored procedures.
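The idea behind an in-memory model, a set of {subject, predicate, object} statements, can be sketched in plain Java. This is only an illustration under stated assumptions: TinyModel and Triple are hypothetical names, not Jena classes, and the sketch says nothing about how ModelMem is actually implemented.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical in-memory model: nothing more than a set of triples.
class TinyModel {
    private final Set<Triple> statements = new LinkedHashSet<>();

    // Adds a statement; returns false if the same triple is already present.
    boolean add(String subject, String predicate, String object) {
        return statements.add(new Triple(subject, predicate, object));
    }

    int size() {
        return statements.size();
    }

    // Read-only view of the statements, in insertion order.
    Set<Triple> statements() {
        return Collections.unmodifiableSet(statements);
    }

    public static void main(String[] args) {
        String dc = "http://purl.org/dc/elements/1.0/";
        String tutorial = "http://hostname/rdf/tutorial/";
        TinyModel model = new TinyModel();
        model.add(tutorial, dc + "creator", "Sava Anisoara");
        model.add(tutorial, dc + "date", "30/11/2009");
        model.add(tutorial, dc + "creator", "Sava Anisoara"); // duplicate, ignored
        for (Triple t : model.statements()) {
            System.out.println(t.subject() + " " + t.predicate() + " " + t.object());
        }
    }
}

// Hypothetical stand-in for a statement: a {subject, predicate, object} triple.
record Triple(String subject, String predicate, String object) {}
```

Because the statements live in a set, asserting the same triple twice leaves the model unchanged, which matches the view of an RDF graph as a set of statements.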
2.2 Creating and serializing an RDF model
We can create an RDF model simply by creating resources and adding a few properties to them (as shown above), and then serializing the model.
However, there is a problem with this approach, of creating the Property and Resource objects directly in the application that builds the model: we have to duplicate this functionality across all applications that want to use this vocabulary.
A possible solution is to build a vocabulary object, using a Java wrapper class.
Examples of Jena's wrapper classes are those for the Dublin Core (DC) and VCARD RDF vocabularies. If we use a wrapper class for the properties and resources of our vocabulary, we define them in one place, an approach that simplifies both implementation and maintenance.
We will now create a vocabulary class for PostCon, using Jena's existing vocabulary wrapper classes as a template. The PostCon wrapper class contains a set of static strings holding property or resource labels and a set of associated RDF properties (see below).
The PostCon vocabulary wrapper class
    package com.burningbird.postcon.vocabulary;

    import com.hp.hpl.mesa.rdf.jena.common.ErrorHelper;
    import com.hp.hpl.mesa.rdf.jena.common.PropertyImpl;
    import com.hp.hpl.mesa.rdf.jena.common.ResourceImpl;
    import com.hp.hpl.mesa.rdf.jena.model.Model;
    import com.hp.hpl.mesa.rdf.jena.model.Property;
    import com.hp.hpl.mesa.rdf.jena.model.Resource;
    import com.hp.hpl.mesa.rdf.jena.model.RDFException;

    public class POSTCON extends Object {
        // URI for vocabulary elements
        protected static final String uri = "http://burningbird.net/postcon/elements/1.0/";

        // Return URI for vocabulary elements
        public static String getURI( ) {
            return uri;
        }

        // Define the property labels and objects
        static final String nbio = "bio";
        public static Property bio = null;
        static final String nrelevancy = "relevancy";
        public static Property relevancy = null;
        static final String npresentation = "presentation";
        public static Resource presentation = null;
        static final String nhistory = "history";
        public static Property history = null;
        static final String nmovementtype = "movementType";
        public static Property movementtype = null;
        static final String nreason = "reason";
        public static Property reason = null;
        static final String nstatus = "currentStatus";
        public static Property status = null;
        static final String nrelated = "related";
        public static Property related = null;
        static final String ntype = "type";
        public static Property type = null;
        static final String nrequires = "requires";
        public static Property requires = null;

        // Instantiate the properties and the resource
        static {
            try {
                // Instantiate the properties
                bio = new PropertyImpl(uri, nbio);
                relevancy = new PropertyImpl(uri, nrelevancy);
                presentation = new PropertyImpl(uri, npresentation);
                history = new PropertyImpl(uri, nhistory);
                related = new PropertyImpl(uri, nrelated);
                type = new PropertyImpl(uri, ntype);
                requires = new PropertyImpl(uri, nrequires);
                movementtype = new PropertyImpl(uri, nmovementtype);
                reason = new PropertyImpl(uri, nreason);
                status = new PropertyImpl(uri, nstatus);
            } catch (RDFException e) {
                ErrorHelper.logInternalError("POSTCON", 1, e);
            }
        }
    }
At this point the PostCon vocabulary is ready to use; we only need to import the class in our applications.
import com.burningbird.postcon.vocabulary.POSTCON;
We present below an example of using the PostCon wrapper class to add properties to a resource:
    import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.vocabulary.*;
    import com.burningbird.postcon.vocabulary.POSTCON;
    import java.io.FileOutputStream;
    import java.io.PrintWriter;

    public class pracRDF extends Object {
        public static void main (String args[]) {
            // Resource names
            String sResource = "http://burningbird.net/articles/monsters1.htm";
            String sRelResource1 = "http://burningbird.net/articles/monsters2.htm";
            String sRelResource2 = "http://burningbird.net/articles/monsters3.htm";
            try {
                // Create an empty graph
                Model model = new ModelMem( );
                // Create the resource
                // and add the properties cascading style
                Resource article = model.createResource(sResource)
                    .addProperty(POSTCON.related, model.createResource(sRelResource1))
                    .addProperty(POSTCON.related, model.createResource(sRelResource2));
                // Print RDF/XML of model to system output
                model.write(new PrintWriter(System.out));
            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }
We can easily observe that using the wrapper class simplified the code considerably.
The generated RDF/XML from the serialized PostCon submodel looks like this:
    <rdf:RDF
      xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
      xmlns:NS0='http://burningbird.net/postcon/elements/1.0/'
    >
      <rdf:Description rdf:about='http://burningbird.net/articles/monsters1.htm'>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters2.htm'/>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters3.htm'/>
      </rdf:Description>
    </rdf:RDF>
2.2.1 Adding more complex structures
However, many RDF models use more complex structures, including nested resources that follow the RDF node-edge-node pattern. We will now look at how Jena handles these more complex model structures.
If we have the following RDF graph,
    <rdf:RDF
      xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
      xmlns:NS0='http://burningbird.net/postcon/elements/1.0/'
      xmlns:dc='http://purl.org/dc/elements/1.0/'
    >
      <rdf:Description rdf:nodeID='A0'>
        <dc:creator>Shelley Powers</dc:creator>
        <dc:publisher>Burningbird</dc:publisher>
        <dc:title xml:lang='en'>Tale of Two Monsters: Legends</dc:title>
      </rdf:Description>
      <rdf:Description rdf:about='http://burningbird.net/articles/monsters1.htm'>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters2.htm'/>
        <NS0:related rdf:resource='http://burningbird.net/articles/monsters3.htm'/>
        <NS0:bio rdf:nodeID='A0'/>
      </rdf:Description>
    </rdf:RDF>
it is easy to generate a graph with the same structure in Jena (the code below uses its own creator and publisher values):
    import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.vocabulary.*;
    import com.burningbird.postcon.vocabulary.POSTCON;
    import java.io.FileOutputStream;
    import java.io.PrintWriter;

    public class pracRDFSecond extends Object {
        public static void main (String args[]) {
            // Resource names
            String sResource = "http://burningbird.net/articles/monsters1.htm";
            String sRelResource1 = "http://burningbird.net/articles/monsters2.htm";
            String sRelResource2 = "http://burningbird.net/articles/monsters3.htm";
            String sType = "http://burningbird.net/postcon/elements/1.0/Resource";
            try {
                // Create an empty graph
                Model model = new ModelMem( );
                // Create the resource
                // and add the properties cascading style
                Resource article = model.createResource(sResource)
                    .addProperty(POSTCON.related, model.createResource(sRelResource1))
                    .addProperty(POSTCON.related, model.createResource(sRelResource2));
                // Create the bio bnode (blank node) resource
                // and add properties
                Resource bio = model.createResource( )
                    .addProperty(DC.creator, "Mihai Marcela")
                    .addProperty(DC.publisher, "Infoiasi")
                    .addProperty(DC.title,
                        model.createLiteral("Tale of Two Monsters: Legends", "en"));
                // Attach to main resource
                article.addProperty(POSTCON.bio, bio);
                // Print RDF/XML of model to system output
                model.write(new PrintWriter(System.out));
            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }
The value of the bio property is a resource that does not have a specific URI (a blank node). Since it is not a literal, we create a new resource for bio and add several properties to it. These properties are defined in the DC vocabulary, so we use the DC wrapper class to create them.
Jena offers support for creating typed nodes (rdf:type) and containers.
An RDF container is a grouping of related items. It represents a collection of things:
Bag: unordered collection
Alt: unordered collection except first element
Seq: ordered collection
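In the RDF model a container is an ordinary node typed as rdf:Bag, rdf:Alt or rdf:Seq, whose members are attached through the numbered membership properties rdf:_1, rdf:_2 and so on. The sketch below builds those triples in plain Java; ContainerSketch and containerTriples are hypothetical names used for illustration only, not part of the Jena API.

```java
import java.util.ArrayList;
import java.util.List;

class ContainerSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Builds the triples for a container node: one rdf:type statement plus
    // one rdf:_n membership statement per member, numbered from 1.
    // kind is expected to be "Bag", "Alt" or "Seq".
    static List<String[]> containerTriples(String node, String kind, List<String> members) {
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] {node, RDF_NS + "type", RDF_NS + kind});
        for (int i = 0; i < members.size(); i++) {
            triples.add(new String[] {node, RDF_NS + "_" + (i + 1), members.get(i)});
        }
        return triples;
    }

    public static void main(String[] args) {
        List<String[]> seq = containerTriples("_:articles", "Seq", List.of(
            "http://burningbird.net/articles/monsters1.htm",
            "http://burningbird.net/articles/monsters2.htm"));
        for (String[] t : seq) {
            System.out.println(String.join(" ", t));
        }
    }
}
```

The three container types produce the same kind of triples; only the rdf:type differs, and it tells the consumer whether the numbering carries meaning.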
2.3 Parsing and querying an RDF document
After we store the data in a model, our next concern is how we can query it.
The data stored in an RDF model can be accessed directly through specific API functions or via RDQL, an RDF query language. Querying with an SQL-like syntax is a very effective way of pulling data from an RDF model, whether it is stored in memory or in a relational database.
Jena’s RDQL is implemented as an object called Query. This object is passed to a query engine (QueryEngine) and the results are stored in a query result (QueryResult).
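The query, engine and result roles can be illustrated with a single triple pattern whose variables are bound against the data. This is a plain-Java sketch, not Jena's Query, QueryEngine or QueryResult classes; the class name, the method run and the convention that terms starting with "?" are variables are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PatternQuerySketch {
    // Matches one triple pattern against the data and returns one map of
    // variable bindings per matching triple. Terms starting with '?' are variables.
    static List<Map<String, String>> run(String[] pattern, List<String[]> triples) {
        List<Map<String, String>> results = new ArrayList<>();
        for (String[] triple : triples) {
            Map<String, String> binding = new LinkedHashMap<>();
            boolean matches = true;
            for (int i = 0; i < 3 && matches; i++) {
                String term = pattern[i];
                if (term.startsWith("?")) {
                    // A variable used twice must bind to the same value both times.
                    String previous = binding.putIfAbsent(term, triple[i]);
                    matches = previous == null || previous.equals(triple[i]);
                } else {
                    matches = term.equals(triple[i]);
                }
            }
            if (matches) {
                results.add(binding);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        String creator = "http://purl.org/dc/elements/1.0/creator";
        List<String[]> data = List.of(
            new String[] {"ex:doc1", creator, "Anisoara Sava"},
            new String[] {"ex:doc1", "ex:year", "2009"},
            new String[] {"ex:doc2", creator, "Marcela Daniela Mihai"});
        for (Map<String, String> row : run(new String[] {"?doc", creator, "?who"}, data)) {
            System.out.println(row);
        }
    }
}
```

A real query engine joins several such patterns and plans their order; the sketch covers only the binding step for one pattern.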
As soon as data is retrieved from the RDF/XML we can iterate through it using any number of iterators (NodeIterator – for general RDF nodes, ResIterator, StmtIterator).
Examples of iterating over RDF data:
Go through statements
A statement is often called a triple, because of its three parts. An RDF graph is represented as a set of statements. The Statement interface provides accessor methods for the subject, predicate and object of a statement.
    // list the statements in a graph
    StmtIterator iter = model.listStatements();
    // print out the subject, predicate and object of each statement
    while (iter.hasNext()) {
        Statement stmt = iter.next();              // get the next statement
        Resource subject = stmt.getSubject();      // get the subject
        Property predicate = stmt.getPredicate();  // get the predicate
        RDFNode object = stmt.getObject();         // get the object
        System.out.print(subject.toString());
        System.out.print(" " + predicate.toString() + " ");
        if (object instanceof Resource) {
            System.out.print(object.toString());
        } else {
            // object is a literal
            System.out.print(" \"" + object.toString() + "\"");
        }
        System.out.println();                      // end the line for this statement
    }
Instead of listStatements we could also dump out the subjects (listSubjects, ResIterator) or, more generally, the objects (listObjects, NodeIterator).
Also, instead of listing all statements or all objects, we can fine‐tune the code to list only subjects, statements, or objects matching specific properties, using the property implementations created within the wrapper class.
Navigating a model
    Property email = model.createProperty(tutorialURI, "emailAddress");
    Resource tutorial = model.getResource(tutorialURI);
    Resource author = tutorial.getProperty(DC.creator).getResource();
    // list all of the author's email addresses
    StmtIterator iter = author.listProperties(email);
    while (iter.hasNext()) {
        System.out.println(" " + iter.next().getObject().toString());
    }
Querying a model
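    // subjects that carry the given date
    ResIterator iter = model.listSubjectsWithProperty(DC.date, date);
    while (iter.hasNext()) {
        System.out.println(iter.next().getProperty(DC.title).toString());
    }
    // objects of all creator statements
    NodeIterator iter2 = model.listObjectsOfProperty(DC.creator);
    while (iter2.hasNext()) {
        // the creators are assumed to be resources, and "name" is assumed to be
        // a Property defined elsewhere by the application
        Resource creator = (Resource) iter2.next();
        System.out.println(creator.getProperty(name).toString());
    }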
2.4 In memory versus persistent model storage
As mentioned before, Jena can also persist data to relational database storage. The supported databases are MySQL, PostgreSQL, Interbase and Oracle. Within each database system, Jena supports several storage layouts:
Generic: all statements are stored in a single table, and resources and literals are indexed using integer identifiers generated by the database
GenericProc: similar to generic, but data access is through stored procedures
MMGeneric: similar to generic but can store multiple models
Hash: similar to generic but uses MD5 hashes to generate the identifiers
MMHash: similar to hash but can store multiple models
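The difference between the two identifier schemes in the layouts above can be sketched in plain Java. The class and method names are hypothetical, and hashing the URI string on its own is an assumption: the text does not say exactly what Jena feeds into the MD5 function.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

class IdentifierSketch {
    private final Map<String, Integer> counterIds = new HashMap<>();

    // "Generic"-style identifier: the next free integer, remembered per value.
    // Here a map plays the role of the database-generated key.
    int counterId(String value) {
        Integer id = counterIds.get(value);
        if (id == null) {
            id = counterIds.size() + 1;
            counterIds.put(value, id);
        }
        return id;
    }

    // "Hash"-style identifier: derived from the value itself, so the same
    // value yields the same identifier without any lookup table.
    static String md5Id(String value) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(value.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        IdentifierSketch ids = new IdentifierSketch();
        String uri = "http://burningbird.net/articles/monsters1.htm";
        System.out.println(ids.counterId(uri));
        System.out.println(md5Id(uri));
    }
}
```

The counter scheme needs a lookup for every value that is inserted, whereas the hash scheme can compute the identifier from the value alone, at the cost of a longer key.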
To store a model in a database, we must first create the structure that will hold the data (the tables must be created in an already existing database, which is then formatted for Jena).
We present below an example of persisting a model to a database, in which we create the JDBC connection ourselves. The model is stored in a MySQL database using the MMGeneric layout (we are not using the slightly more efficient hash layout, MMHash, because the generic layout is the better choice if we access the data directly through JDBC rather than through Jena).
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.rdb.ModelRDB;
    import com.hp.hpl.mesa.rdf.jena.rdb.DBConnection;
    import java.sql.*;

    public class pracRDFThree extends Object {
        public static void main (String args[]) {
            // Pass one RDF document, connection string, user name and password
            String sUri = args[0];
            String sConn = args[1];
            String sUser = args[2];
            String sPass = args[3];
            try {
                // Load driver class
                Class.forName("com.mysql.jdbc.Driver").newInstance( );
                // Establish connection with the supplied connection info
                Connection con = DriverManager.getConnection(sConn, sUser, sPass);
                DBConnection dbcon = new DBConnection(con);
                // Format database
                ModelRDB.create(dbcon, "MMGeneric", "Mysql");
                // Create and read the model
                ModelRDB model1 = ModelRDB.createModel(dbcon, "modelOne");
                model1.read(sUri);
            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }
Once the model is persisted any number of applications can access it.
The next example shows how to open the model stored in the MySQL database and dump all its objects:
    import com.hp.hpl.mesa.rdf.jena.model.*;
    import com.hp.hpl.mesa.rdf.jena.rdb.ModelRDB;
    import com.hp.hpl.mesa.rdf.jena.rdb.DBConnection;
    import java.sql.*;

    public class pracRDFFour extends Object {
        public static void main (String args[]) {
            String sConn = args[0];
            String sUser = args[1];
            String sPass = args[2];
            try {
                // Load driver class
                Class.forName("com.mysql.jdbc.Driver").newInstance( );
                // Establish connection - replace with your own conn info
                Connection con = DriverManager.getConnection(sConn, sUser, sPass);
                DBConnection dbcon = new DBConnection(con);
                // Open the existing model
                ModelRDB model1 = ModelRDB.open(dbcon, "modelOne");
                // Print out objects in the model using toString
                NodeIterator iter = model1.listObjects( );
                while (iter.hasNext( )) {
                    System.out.println(" " + iter.next( ).toString( ));
                }
            } catch (Exception e) {
                System.out.println("Failed: " + e);
            }
        }
    }
In addition to accessing the data through the Jena API, you can also access it directly using whatever database connectivity you prefer.
As you can see the Jena API for managing RDF databases is very straightforward and easy to use.
2.5 Jena query mechanism evaluation
Tests have been carried out to explore the benefits of the semantic web technologies RDF and RDF Schema over existing ones such as XML and XML Schema in the context of a scientific application. The proposal was to replace the Chemistry Markup Language (CML, a well-established XML format for exchanging information about molecules, reactions and experiments) with RDF.
The specific technical goal of this study was to evaluate whether Jena would be suitable for storing large numbers of molecules, and whether the RDQL query functionality would support searches for substructures on the scale that would be required in a realistic application.
The ability of RDF to support graph‐based query was identified as a potential benefit in this application, allowing chemists to search a repository of molecules for molecular sub‐structures.
However, queries for simple molecular substructures (such as N-C-N) within modest repositories (100 molecules, 26118 statements) took a prohibitively long time to complete (>24 hours). So, unless the efficiency of graph query engines is improved, RDF technologies remain inadequate for data-intensive applications. (See more details at www.w3c.rl.ac.uk/SWAD/papers/RDFMolecules_final.doc ).
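Why a substructure query is expensive can be seen from a naive evaluation: each extra edge in the pattern adds another pass over the bond statements. The following plain-Java sketch finds N-C-N paths with a nested loop over the bonds. It is an illustration only, neither the study's code nor Jena's RDQL engine, and the data layout (a map from atom id to element symbol plus a list of undirected bonds) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SubstructureSketch {
    // Finds N-C-N paths by joining the bond list with itself.
    // elements maps atom id -> element symbol; each bond is an undirected {atomA, atomB} pair.
    static List<String> findNCN(Map<String, String> elements, List<String[]> bonds) {
        List<String> hits = new ArrayList<>();
        for (String[] first : bonds) {
            for (String[] second : bonds) {
                if (first == second) continue;
                String centre = sharedAtom(first, second);
                if (centre == null || !"C".equals(elements.get(centre))) continue;
                String left = other(first, centre);
                String right = other(second, centre);
                // The ordering test reports each path once instead of twice.
                if ("N".equals(elements.get(left)) && "N".equals(elements.get(right))
                        && left.compareTo(right) < 0) {
                    hits.add(left + "-" + centre + "-" + right);
                }
            }
        }
        return hits;
    }

    // Returns an atom that occurs in both bonds, or null if there is none.
    static String sharedAtom(String[] a, String[] b) {
        for (String x : a) {
            for (String y : b) {
                if (x.equals(y)) return x;
            }
        }
        return null;
    }

    // Returns the atom at the other end of the bond.
    static String other(String[] bond, String atom) {
        return bond[0].equals(atom) ? bond[1] : bond[0];
    }

    public static void main(String[] args) {
        Map<String, String> elements = Map.of("n1", "N", "c1", "C", "n2", "N", "o1", "O");
        List<String[]> bonds = List.of(
            new String[] {"n1", "c1"},
            new String[] {"c1", "n2"},
            new String[] {"c1", "o1"});
        System.out.println(findNCN(elements, bonds));
    }
}
```

With b bond statements the loop examines on the order of b squared pairs for this two-edge pattern, and longer patterns add further nesting unless the engine uses indexes or reorders the joins.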
Undoubtedly, RDF offers powerful and flexible alternatives for modelling complex data structures, and its application in this area deserves further study.
3 SESAME
3.1 Introduction
Sesame is an open source framework for storage, inferencing and querying of RDF data. The community site for information about development with Sesame is http://openrdf.org. It was originally developed by the Dutch company Aduna as a research prototype for the EU research project On-To-Knowledge. Today the Sesame framework is maintained and developed by Aduna in collaboration with the NLnet Foundation and a number of volunteer developers who contribute ideas, bug reports and fixes.
Sesame includes an excellent administration interface in its distribution, is easy to install, and is considered one of the leading graph stores, with excellent performance.
The most important characteristics of Sesame 2.x are:
completely targeted at Java 5; all APIs use Java 5 features such as typed collections and iterators.
support for the SPARQL Query Language.
support for context/provenance, making it possible to keep track of individual RDF data units (like files, for instance).
proper transaction/rollback support.
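Context support can be pictured as storing a fourth value, the source of the statement, next to each triple, so that everything loaded from one file can later be found or removed as a unit. The sketch below is plain Java with hypothetical names (ContextSketch, Quad) and does not use the Sesame API.

```java
import java.util.ArrayList;
import java.util.List;

class ContextSketch {
    // A statement plus the context (here: the source file name) it came from.
    record Quad(String subject, String predicate, String object, String context) {}

    private final List<Quad> quads = new ArrayList<>();

    void add(String subject, String predicate, String object, String context) {
        quads.add(new Quad(subject, predicate, object, context));
    }

    // Removes every statement loaded from the given context and
    // returns how many statements were removed.
    int clear(String context) {
        int before = quads.size();
        quads.removeIf(q -> q.context().equals(context));
        return before - quads.size();
    }

    int size() {
        return quads.size();
    }

    public static void main(String[] args) {
        ContextSketch store = new ContextSketch();
        store.add("ex:a", "ex:p", "1", "first.rdf");
        store.add("ex:b", "ex:p", "2", "first.rdf");
        store.add("ex:c", "ex:p", "3", "second.rdf");
        System.out.println(store.clear("first.rdf") + " removed, " + store.size() + " left");
    }
}
```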
3.2 Licence
Whereas version 1.x of Sesame was available under the terms of the GNU Lesser General Public License (LGPL), version 2.1, version 2.x is available under a BSD-style license. This version also includes software developed by the Apache Software Foundation.
3.3 Architecture
Sesame has an architecture that allows persistent storage of RDF data and schema information and subsequent querying of that information.
To keep Sesame DBMS-independent, all DBMS-specific code is concentrated in a single architectural layer: the Storage And Inference Layer (SAIL).
SAIL is an application programming interface (API) that offers RDF-specific methods to its clients and translates these methods into calls to its specific DBMS. An important advantage of introducing such a separate layer is that it makes it possible to implement Sesame on top of a wide variety of repositories without changing any of Sesame's other components.
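The layering idea can be sketched as an interface that clients program against, with the storage technology hidden behind it. The names below (StorageLayer, InMemoryLayer, ExportModule) are hypothetical and much simpler than the real SAIL API; the sketch only shows why a client module does not change when the storage implementation is swapped.

```java
import java.util.ArrayList;
import java.util.List;

class SailSketch {
    public static void main(String[] args) {
        StorageLayer store = new InMemoryLayer(); // could be swapped for a DBMS-backed layer
        store.addStatement("ex:article1", "ex:related", "ex:article2");
        store.addStatement("ex:article1", "ex:title", "Monsters");
        System.out.print(new ExportModule(store).exportAll());
    }
}

// Hypothetical storage interface: the only part of the store that clients see.
interface StorageLayer {
    void addStatement(String subject, String predicate, String object);

    // Returns the matching statements; a null argument acts as a wildcard.
    List<String[]> getStatements(String subject, String predicate, String object);
}

// One possible implementation. A DBMS-backed one would translate the same
// two methods into calls to its database.
class InMemoryLayer implements StorageLayer {
    private final List<String[]> statements = new ArrayList<>();

    @Override
    public void addStatement(String subject, String predicate, String object) {
        statements.add(new String[] {subject, predicate, object});
    }

    @Override
    public List<String[]> getStatements(String subject, String predicate, String object) {
        List<String[]> result = new ArrayList<>();
        for (String[] st : statements) {
            if ((subject == null || subject.equals(st[0]))
                    && (predicate == null || predicate.equals(st[1]))
                    && (object == null || object.equals(st[2]))) {
                result.add(st);
            }
        }
        return result;
    }
}

// A client module written against the interface only.
class ExportModule {
    private final StorageLayer store;

    ExportModule(StorageLayer store) {
        this.store = store;
    }

    // Writes every statement as one "subject predicate object" line.
    String exportAll() {
        StringBuilder out = new StringBuilder();
        for (String[] st : store.getStatements(null, null, null)) {
            out.append(st[0]).append(' ').append(st[1]).append(' ').append(st[2]).append('\n');
        }
        return out.toString();
    }
}
```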
The functional modules in Sesame are clients of the SAIL API. Currently, there are three such modules: the RQL query engine, the RDF admin module and the RDF export module.
Direct access to Sesame's functional modules goes through the Access APIs. The same APIs can also be used to reach the next component of Sesame's architecture, the Sesame Server, which provides HTTP- and RMI-based access to Sesame's APIs.
The Model API is represented by the Model interface, which holds a collection of indexed statements and implements Set<Statement>. Unlike the MemoryStore, a Model preserves the statement ordering, but it does not provide concurrent modification, transaction or query support. The Model is a welcome addition for programs that simply need to read and write RDF files.
To get access to Sesame's RDF parsers, writers and basic model types such as URI, Literal, Statement and Model, we only have to include sesame-client.jar.
The following code demonstrates how to work with RDF files: it reads an RDF file, validates it and then writes it back in an organized format:
    File file = new File("somefile.rdf");
    StatementCollector collector = new StatementCollector();
    InputStream in = new FileInputStream(file);
    try {
        RDFParser parser = new RDFXMLParser();
        parser.setRDFHandler(collector);
        parser.parse(in, file.toURI().toString());
    } finally {
        in.close();
    }
    Model model = collector.getModel();
    // perform some validation
    for (Value obj : model.filter(null, RDFS.SUBCLASSOF, null).objects()) {
        if (!model.contains((Resource) obj, RDF.TYPE, RDFS.CLASS))
            throw new Exception("missing super class");
    }
    // sort the statements and write them back out
    model = new ModelOrganizer(model).organize();
    OutputStream out = new FileOutputStream(file);
    try {
        RDFWriter writer = new RDFXMLPrettyWriter(out);
        writer.startRDF();
        for (String prefix : model.getNamespaces().keySet()) {
            writer.handleNamespace(prefix, model.getNamespace(prefix));
        }
        for (Statement st : model) {
            writer.handleStatement(st);
        }
        writer.endRDF();
    } finally {
        out.close();  // close the output stream; the input stream was closed above
    }
3.4 The Repository API
The Repository API is the central access point for Sesame repositories. It can be used to query and update the contents of both local and remote repositories. The Repository API handles all the details of client‐server communication, allowing you to handle remote repositories as easily as local ones.
The main interfaces for the repository API can be found in package org.openrdf.sesame.repository. The implementations of these interfaces for local and remote repositories can be found in subpackages of this package.
3.5 Transactions
Sesame 3.0 introduces isolation levels to the API. In 2.x each store (MemoryStore, NativeStore and RdbmsStore) used its own fixed isolation level. In Sesame 3.0, each connection can choose its own isolation level. These levels are similar to the ANSI SQL isolation levels, but Sesame has created its own RDF-specific definitions and provides a few more isolation levels that are relevant to distributed and optimistic stores.
Unlike the ANSI SQL isolation levels, Sesame distinguishes between snapshot and serializable isolation levels. Furthermore, Sesame's definition of snapshot isolation allows independent RDF stores to use eventual consistency to manage the changes to the store. This makes clusters of RDF stores easier to implement for connections that allow eventual consistency, but still require a consistent non‐changing view of the store. Some of Sesame's lower levels of isolation explicitly permit the use of HTTP caching (with some restrictions). This allows connections that only need weak consistency to operate at a significantly enhanced speed, by not having to check for new modifications every time.
The following can be considered future work for this framework:
Transaction rollback support. While the SAIL API has support for transactions, it currently has no transaction rollback feature. Transaction rollbacks, especially in the case of uploading information, are crucial if we wish to guarantee database consistency.
Versioning support. The current version of Sesame has no support for versioning. A basic type of versioning will enable more elaborate versioning schemes.
DAML+OIL support.
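Why rollback matters for uploads can be shown with a small plain-Java sketch: statements are staged during the upload and only become visible on commit, so a failed upload leaves the store as it was. RollbackSketch and its methods are hypothetical names; this is one simple way to obtain the behaviour, not how Sesame implements transactions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RollbackSketch {
    private final Set<String> committed = new HashSet<>();
    private final List<String> pendingAdds = new ArrayList<>();

    // Stages a statement; it is not visible until commit() is called.
    void add(String statement) {
        pendingAdds.add(statement);
    }

    // Makes all staged statements visible.
    void commit() {
        committed.addAll(pendingAdds);
        pendingAdds.clear();
    }

    // Discards all staged statements; committed data is untouched.
    void rollback() {
        pendingAdds.clear();
    }

    boolean contains(String statement) {
        return committed.contains(statement);
    }

    int size() {
        return committed.size();
    }

    public static void main(String[] args) {
        RollbackSketch store = new RollbackSketch();
        store.add("ex:a ex:p ex:b");
        store.commit();
        try {
            store.add("ex:c ex:p ex:d");
            throw new IllegalStateException("simulated parse error halfway through an upload");
        } catch (IllegalStateException e) {
            store.rollback(); // the half-uploaded data never reaches the store
        }
        System.out.println(store.size() + " statement(s) after failed upload");
    }
}
```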
3.6 Query Language
Sesame currently supports two query languages: SeRQL and SPARQL.
SeRQL ("Sesame RDF Query Language", pronounced "circle") is a new RDF/RDFS query language that is currently being developed by Aduna as part of Sesame. It combines the best features of other (query) languages (RQL, RDQL, N-Triples, N3) and adds some of its own.
Some of SeRQL's most important features are:
Graph transformation.
RDF Schema support.
XML Schema datatype support.
Expressive path expression syntax.
Optional path matching.
4 Jena RDF storage
The Jena architecture provides an abstract RDF model that manages an internal graph in which the RDF data is stored. Applications interact with the abstract Model, which translates higher-level operations into low-level operations on the triples stored in an RDF graph.
The Jena database subsystem implements persistence for RDF graphs using a relational database through a JDBC connection.
The first Jena version used a normalized relational schema, which kept the tables compact but required joins for most find operations and therefore increased response time. The current version, Jena2, trades space for time by using a denormalized schema: both resource URIs and simple literal values are stored directly in the statement table, so many find operations can be performed without a join. The size of the statement table then becomes a problem, but Jena2 provides several options to reduce it, such as compressing namespaces (by defining a prefix and using it as a reference to the namespace) and storing long values only once.
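Namespace compression amounts to replacing a long, frequently repeated namespace string with a short reference before the value is stored, and expanding it again when the value is read. The sketch below shows the idea in plain Java; the class and method names are hypothetical, and Jena2's actual table layout is not reproduced.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PrefixCompressionSketch {
    // Maps a namespace URI to the short prefix that stands in for it.
    private final Map<String, String> prefixByNamespace = new LinkedHashMap<>();

    void register(String prefix, String namespace) {
        prefixByNamespace.put(namespace, prefix);
    }

    // Replaces a registered namespace with "prefix:"; other URIs are returned unchanged.
    String compress(String uri) {
        for (Map.Entry<String, String> e : prefixByNamespace.entrySet()) {
            if (uri.startsWith(e.getKey())) {
                return e.getValue() + ":" + uri.substring(e.getKey().length());
            }
        }
        return uri;
    }

    // Reverses compress(); values without a registered prefix are returned unchanged.
    String expand(String stored) {
        int colon = stored.indexOf(':');
        if (colon > 0) {
            String prefix = stored.substring(0, colon);
            for (Map.Entry<String, String> e : prefixByNamespace.entrySet()) {
                if (e.getValue().equals(prefix)) {
                    return e.getKey() + stored.substring(colon + 1);
                }
            }
        }
        return stored;
    }

    public static void main(String[] args) {
        PrefixCompressionSketch table = new PrefixCompressionSketch();
        table.register("dc", "http://purl.org/dc/elements/1.0/");
        String stored = table.compress("http://purl.org/dc/elements/1.0/creator");
        System.out.println(stored + " <-> " + table.expand(stored));
    }
}
```

Every statement that uses a Dublin Core property then stores a few characters instead of the full namespace, which is where the space saving in a denormalized statement table comes from.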
5 Sesame RDF storage
Sesame is an open source RDF database with support for RDF Schema inferencing and querying information. The progress in the development of Sesame has showed some RDF store features.
The performance of Sesame together with object‐relational DBMS has been proved in several studies [Jeen Broekstra, Arjohn Kampman, and Frank van Harmelen. Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. http://wwwis.win.tue.nl/~jbroekst/papers/ISWC02.pdf ]. All agree in the same conclusion: The performance is very low, if the database system creates a table whenever a new class or property is added.
The study carried out with PostgreSQL in [Jeen Broekstra, Arjohn Kampman, and Frank van Harmelen. Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. http://wwwis.win.tue.nl/~jbroekst/papers/ISWC02.pdf] uses a different approach (similar to Jena's) in which all RDF statements are inserted into a single table with three columns: “Subject, Predicate, Object”. In scenarios where the schema changes often, this approach performs better than the previous one.
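The difference between the two layouts can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names are hypothetical, in-memory collections stand in for database tables, and a counter stands in for the schema changes (table creations) a real DBMS would perform. It counts per-property tables only; per-class tables would behave the same way.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical comparison of two storage layouts for RDF statements. */
final class StorageLayoutSketch {

    /** Layout 1: one table per property; an unseen property forces a schema change. */
    static final class TablePerProperty {
        private final Map<String, List<String[]>> tables = new LinkedHashMap<>();
        private int schemaChanges = 0;

        void add(String subject, String predicate, String object) {
            List<String[]> table = tables.get(predicate);
            if (table == null) {
                table = new ArrayList<>();
                tables.put(predicate, table);
                schemaChanges++; // stands in for creating a new table in the database
            }
            table.add(new String[] { subject, object });
        }

        int schemaChangeCount() {
            return schemaChanges;
        }
    }

    /** Layout 2: a single three-column table (subject, predicate, object). */
    static final class SingleTable {
        private final List<String[]> statements = new ArrayList<>();

        void add(String subject, String predicate, String object) {
            statements.add(new String[] { subject, predicate, object }); // plain row insert
        }

        int schemaChangeCount() {
            return 0; // the table layout never changes, whatever properties appear
        }

        int size() {
            return statements.size();
        }
    }

    public static void main(String[] args) {
        String[][] data = {
            { "ex:alice", "foaf:name", "Alice" },
            { "ex:alice", "foaf:mbox", "mailto:alice@example.org" },
            { "ex:alice", "ex:worksFor", "ex:acme" },
        };
        TablePerProperty perProperty = new TablePerProperty();
        SingleTable single = new SingleTable();
        for (String[] row : data) {
            perProperty.add(row[0], row[1], row[2]);
            single.add(row[0], row[1], row[2]);
        }
        System.out.println("table-per-property schema changes: " + perProperty.schemaChangeCount());
        System.out.println("single-table schema changes: " + single.schemaChangeCount());
    }
}
```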
It has been shown that the performance of inference over large amounts of data can be improved by using an object-oriented database (for both Jena and Sesame), because such a database implements “on demand materialization” and concepts such as “class extent”. One of the most important features of an object-oriented database is that objects in the database are accessed transparently, so that interacting with persistent objects is no different from interacting with in-memory objects.
6 Conclusions
Jena is the best-known RDF triple store. It provides a programmatic environment for RDF, RDFS, OWL and SPARQL, and includes a rule-based inference engine.
The documentation is very good, and it also comes with some extra tools developed by third parties.
A plus for Jena is that it works very well on Linux, FreeBSD and Windows.
Because many programmers use it, development support for Jena (integration into existing IDEs) is very good.
A minus for Jena is that it does not come with a web framework and SPARQL interface. Sesame, on the other hand, has both and also comes with a more modern architecture. As a result, the Semantic Web community has recently favoured Sesame.
Both Jena and Sesame take advantage of conventional database systems, although Sesame supports only MySQL and PostgreSQL.
At this moment Sesame seems to be the most modern approach. It comes with a plugin architecture that makes it very modular (in its second version it offers interfaces to the Spring Framework).
A major advantage of this framework is that it offers fairly good documentation for users and developers. Because it is used by many programmers, it is not difficult to find code examples and opinions about particular aspects of the framework. And because it is implemented in Java, it is portable across many platforms.
As we said before, the most popular aspect of Sesame is that it can be deployed on Tomcat, so that one can upload an ontology and test queries in the browser.
Another important feature of the Sesame architecture is its abstraction from the details of any particular repository used for the actual storage. This makes it possible to port Sesame to a large variety of different repositories, including relational databases, RDF triple stores, and even remote storage services on the Web.
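A rough sketch of what such a storage abstraction looks like from the client's point of view is given below. The interface `StatementStore` and both backends are hypothetical and deliberately tiny; they are not Sesame's actual SAIL API, and in-memory collections stand in for real repositories. The point is only that the client method `loadAndCount` is written against the interface and runs unchanged on either backend.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of client code that is independent of the storage backend. */
final class StorageAbstractionSketch {

    /** Hypothetical storage interface; NOT Sesame's actual SAIL API. */
    interface StatementStore {
        void add(String subject, String predicate, String object);

        List<String[]> aboutSubject(String subject);
    }

    /** Backend 1: a flat list that is scanned on every lookup. */
    static final class ListStore implements StatementStore {
        private final List<String[]> statements = new ArrayList<>();

        @Override
        public void add(String subject, String predicate, String object) {
            statements.add(new String[] { subject, predicate, object });
        }

        @Override
        public List<String[]> aboutSubject(String subject) {
            List<String[]> hits = new ArrayList<>();
            for (String[] statement : statements) {
                if (statement[0].equals(subject)) {
                    hits.add(statement);
                }
            }
            return hits;
        }
    }

    /** Backend 2: statements grouped by subject for direct lookup. */
    static final class IndexedStore implements StatementStore {
        private final Map<String, List<String[]>> bySubject = new HashMap<>();

        @Override
        public void add(String subject, String predicate, String object) {
            bySubject.computeIfAbsent(subject, key -> new ArrayList<>())
                     .add(new String[] { subject, predicate, object });
        }

        @Override
        public List<String[]> aboutSubject(String subject) {
            List<String[]> hits = new ArrayList<>();
            List<String[]> found = bySubject.get(subject);
            if (found != null) {
                hits.addAll(found);
            }
            return hits;
        }
    }

    /** Client code: depends only on the interface, never on a concrete backend. */
    static int loadAndCount(StatementStore store) {
        store.add("ex:alice", "foaf:name", "Alice");
        store.add("ex:alice", "foaf:knows", "ex:bob");
        store.add("ex:bob", "foaf:name", "Bob");
        return store.aboutSubject("ex:alice").size();
    }

    public static void main(String[] args) {
        System.out.println("list backend: " + loadAndCount(new ListStore()));
        System.out.println("indexed backend: " + loadAndCount(new IndexedStore()));
    }
}
```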
Like Jena, Sesame has a lively community and a lot of documentation, which leads us to believe that it will continue to be developed and that interest in it will only increase.
Currently, Jena and Sesame are the two most popular RDF store implementations. Because there is no RDF API specification accepted by the Java community, programmers use either the Jena API or the Sesame API to publish, query, and reason over RDF triples. The resulting RDF application source code is therefore tightly coupled to one of the two APIs. Interoperability becomes a problem when a Jena-based application needs to access a Sesame backend, or a Sesame-based application needs to access a Jena backend. This problem is partly solved by the Sesame-Jena Adapter, which provides access to a Jena model through the Sesame SAIL API.
Both Jena and Sesame are distributed under a BSD license, which allows developers to modify and redistribute packages and sources.
In our opinion, Jena is a very strong and mature tool that is a little slower than Sesame because it follows the SPARQL specifications to the letter. That is why developers increasingly prefer the Sesame API, which comes with a more modern approach and somewhat better speed, but very low scalability.
So it is up to developers to decide what their applications can give up: speed, strict adherence to the specifications, or scalability.
Another well-regarded semantic store is Mulgara, an open source, massively scalable, transaction-safe, purpose-built database for the storage and retrieval of RDF, written in Java.