3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse



The third and last project update on my thesis project, titled "Integrating SWI-Prolog for Semantic Reasoning in Bioclipse".


3rd Status report of degree project

Integrating SWI-Prolog for semantic reasoning in Bioclipse

Samuel Lampa, 2010-04-07

Project blog: http://saml.rilspace.com


How do biochemical questions formulated as Prolog queries compare to other solutions available in Bioclipse in terms of speed and expressiveness?

Research question

● Jena
  ● General RDF querying (via SPARQL)
● Pellet
  ● OWL-DL reasoning (via SPARQL)
  ● General querying via Jena (via SPARQL)
● SWI-Prolog
  ● Access to RDF triples (both assertion and querying) via the rdf( Subject, Predicate, Object ) method
  ● Complex wrapper/convenience methods can be built

Compared Semantic Tools
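As an illustration of the SWI-Prolog access path listed above, here is a minimal sketch of asserting triples, querying them with rdf/3, and hiding the predicate IRI behind a convenience wrapper. The ex: namespace, the resource names and the predicates has_peak/2 and demo/1 are hypothetical and chosen for this example only; they are not NMRShiftDB data. rdf_register_ns/2 is the older name that the project code uses; newer SWI-Prolog versions also offer rdf_register_prefix/2.

```prolog
:- use_module(library(semweb/rdf_db)).

% Hypothetical example namespace (an assumption, not part of the project data)
:- rdf_register_ns(ex, 'http://example.org/onto#').

% Convenience wrapper: gives the RDF predicate a readable Prolog name
has_peak( Spectrum, Peak ) :-
    rdf( Spectrum, ex:hasPeak, Peak ).

% Assert two triples, then query them back through the wrapper.
% sort/2 makes the result order-independent and removes duplicates.
demo( Peaks ) :-
    rdf_assert( ex:spectrum1, ex:hasPeak, ex:peak1 ),
    rdf_assert( ex:spectrum1, ex:hasPeak, ex:peak2 ),
    rdf_global_id( ex:spectrum1, Spectrum ),
    findall( Peak, has_peak( Spectrum, Peak ), Found ),
    sort( Found, Peaks ).
```

Calling demo( Peaks ) should bind Peaks to the full IRIs of the two asserted peaks. The wrapper is what keeps later, more complex predicates free of namespace details.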

Interesting use case: Querying NMRShiftDB data
● Characteristics:
  – Rather shallow RDF graph
  – Numeric (float value) interval

Use Case: NMRShiftDB

NMR Spectrum Similarity Search

What to test: Given a spectrum, represented as a list of shift values, find spectra with the same shifts (allowing variation within a limit).

→ “Dereferencing” spectra




:hasSpectrum <http://pele.farmbio.uu.se/nmrshiftdb/?spectrumId=4735>;

:moleculeId "234".


:hasPeak <http://pele.farmbio.uu.se/nmrshiftdb/?s4735p0>,




:hasShift "17.6"^^xsd:decimal .


:hasShift "18.3"^^xsd:decimal .


:hasShift "22.6"^^xsd:decimal .

Example Data

% Load the SWI-Prolog RDF library, which provides rdf/3 and rdf_register_ns/2
:- use_module(library(semweb/rdf_db)).

% Register RDF namespaces, for use in the convenience methods at the end
:- rdf_register_ns(nmr, 'http://www.nmrshiftdb.org/onto#').
:- rdf_register_ns(xsd, 'http://www.w3.org/2001/XMLSchema#').

find_mol_with_peak_vals_near( SearchShiftVals, Mols ) :-
    % Pick each 'Mol' that matches the pattern:
    %   list_peak_shifts_of_mol( Mol, MolShiftVals ),
    %   contains_list_elems_near( SearchShiftVals, MolShiftVals )
    % and collect them in 'Mols'.
    % MolShiftVals^ keeps setof/3 from splitting the result per list of shift values.
    setof( Mol,
           MolShiftVals^(
               list_peak_shifts_of_mol( Mol, MolShiftVals ),               % A Mol's shift values are collected
               contains_list_elems_near( SearchShiftVals, MolShiftVals )   % and compared against the given SearchShiftVals
           ),
           Mols ).   % In 'Mols', all 'Mol's whose shift values match the SearchShiftVals are collected.

% Given a 'Mol', give its shift values in list form, in 'ListOfPeaks'
list_peak_shifts_of_mol( Mol, ListOfPeaks ) :-
    has_spectrum( Mol, Spectrum ),
    findall( ShiftVal,
             ( has_peak( Spectrum, Peak ),
               has_shift_val( Peak, ShiftVal ) ),
             ListOfPeaks ).

% Compare two lists to see if list2 has near-matches for each of the values in list1
contains_list_elems_near( [ElemHead|ElemTail], List ) :-
    member_close_to( ElemHead, List ),
    ( contains_list_elems_near( ElemTail, List )
    ; ElemTail == [] ).

% Recursive construct: %

% Test first the end criterion:
member_close_to( X, [ Y | _ ] ) :-
    closeTo( X, Y ).
% but if the above doesn't validate, then recursively continue with the tail of List2:
member_close_to( X, [ _ | Tail ] ) :-
    member_close_to( X, Tail ).

% Numerical near-match
closeTo( Val1, Val2 ) :-
    abs(Val1 - Val2) =< 0.3.

% Convenience accessory methods %

has_shift_val( Peak, ShiftVal ) :-
    rdf( Peak, nmr:hasShift, literal(type(xsd:decimal, ShiftValLiteral))),
    atom_number_create( ShiftValLiteral, ShiftVal ).

% Predicate names follow the example data and the SPARQL query (hasSpectrum, hasPeak)
has_spectrum( Mol, Spectrum ) :-
    rdf( Mol, nmr:hasSpectrum, Spectrum ).

has_peak( Spectrum, Peak ) :-
    rdf( Spectrum, nmr:hasPeak, Peak ).

% Wrapper method for the atom_number/2 method, which converts atoms (string constants) to numbers.
% The wrapper method avoids exceptions on empty atoms, instead converting them into a zero.
atom_number_create( Atom, Number ) :-
    (   atom_length( Atom, AtomLength ), AtomLength > 0   % IF atom is not empty
    ->  atom_number( Atom, Number )                       % THEN convert the atom to a numerical value
    ;   atom_number( '0', Number )                        % ELSE convert to a zero
    ).

Prolog code

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX afn: <http://jena.hpl.hp.com/ARQ/function#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX nmr: <http://www.nmrshiftdb.org/onto#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?s
WHERE {
  ?s nmr:hasPeak [ nmr:hasShift ?s1 ] ,
                 [ nmr:hasShift ?s2 ] ,
                 [ nmr:hasShift ?s3 ] ,
                 [ nmr:hasShift ?s4 ] ,
                 [ nmr:hasShift ?s5 ] ,
                 [ nmr:hasShift ?s6 ] ,
                 [ nmr:hasShift ?s7 ] ,
                 [ nmr:hasShift ?s8 ] ,
                 [ nmr:hasShift ?s9 ] ,
                 [ nmr:hasShift ?s10 ] ,
                 [ nmr:hasShift ?s11 ] ,
                 [ nmr:hasShift ?s12 ] ,
                 [ nmr:hasShift ?s13 ] ,
                 [ nmr:hasShift ?s14 ] ,
                 [ nmr:hasShift ?s15 ] ,
                 [ nmr:hasShift ?s16 ] .
  FILTER ( fn:abs(?s1 - 17.6) < 0.3 ) .
  FILTER ( fn:abs(?s2 - 18.3) < 0.3 ) .
  FILTER ( fn:abs(?s3 - 22.6) < 0.3 ) .
  FILTER ( fn:abs(?s4 - 26.5) < 0.3 ) .
  FILTER ( fn:abs(?s5 - 31.7) < 0.3 ) .
  FILTER ( fn:abs(?s6 - 33.5) < 0.3 ) .
  FILTER ( fn:abs(?s7 - 33.5) < 0.3 ) .
  FILTER ( fn:abs(?s8 - 41.8) < 0.3 ) .
  FILTER ( fn:abs(?s9 - 42.0) < 0.3 ) .
  FILTER ( fn:abs(?s10 - 42.2) < 0.3 ) .
  FILTER ( fn:abs(?s11 - 78.34) < 0.3 ) .
  FILTER ( fn:abs(?s12 - 140.99) < 0.3 ) .
  FILTER ( fn:abs(?s13 - 158.3) < 0.3 ) .
  FILTER ( fn:abs(?s14 - 193.4) < 0.3 ) .
  FILTER ( fn:abs(?s15 - 203.0) < 0.3 ) .
  FILTER ( fn:abs(?s16 - 0) < 0.3 ) .
}



“Expressivity”: SPARQL vs Prolog


Prolog predicate taking variables

How to change “input parameters”?
● SPARQL: Modify SPARQL query
● Prolog: Change input parameter

● SPARQL
  ● Fewer lines of code
  ● Easier to understand the code
● Prolog
  ● Easier to change input parameters
  ● Easier to re-use existing logic (call a method rather than cut and paste SPARQL code)
  ● Easier to change aspects of the execution logic
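The point about input parameters can be illustrated with a small, self-contained sketch. It assumes hypothetical in-memory facts (mol_shifts/2) in place of the RDF store, and the predicate names has_shift_near/3 and mols_matching/3 are invented for this example; it is not the project code. The tolerance is passed as an argument here, as one possible way of changing an aspect of the execution logic without editing the query itself.

```prolog
% Hypothetical in-memory facts standing in for the RDF store
% (illustrative values, not NMRShiftDB data).
mol_shifts( mol_a, [17.6, 18.3, 22.6] ).
mol_shifts( mol_b, [17.5, 41.8, 78.3] ).
mol_shifts( mol_c, [26.5, 31.7, 33.5] ).

% True if some value in Shifts lies within Tolerance of X.
has_shift_near( Tolerance, Shifts, X ) :-
    member( Y, Shifts ),
    abs(X - Y) =< Tolerance,
    !.

% Mols is the sorted list of molecules that have a near-match
% for every value in SearchShifts. Unlike setof/3, findall/3
% followed by sort/2 yields [] when nothing matches.
mols_matching( Tolerance, SearchShifts, Mols ) :-
    findall( Mol,
             ( mol_shifts( Mol, Shifts ),
               maplist( has_shift_near( Tolerance, Shifts ), SearchShifts ) ),
             Found ),
    sort( Found, Mols ).
```

The same predicate then serves different searches by changing only the arguments:

```prolog
?- mols_matching( 0.3, [17.6], Mols ).          % Mols = [mol_a, mol_b]
?- mols_matching( 0.3, [17.6, 22.6], Mols ).    % Mols = [mol_a]
?- mols_matching( 0.05, [17.6], Mols ).         % Mols = [mol_a]
```

The corresponding SPARQL query would need a new peak pattern and FILTER line for each added search value.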



Prolog vs Jena vs JenaTDB vs Pellet

Prolog vs Jena vs JenaTDB

● Prolog is the fastest (in-memory only)
● Jena faster with disk based than with in-memory RDF store!
● Pellet with in-memory store is slow
● Pellet with disk based store out of



Project plan from last

Planned final presentation: 28 April 2010 (BMC B7:101a)

Everybody is welcome!

Thank you!

Project blog: http://saml.rilspace.com

