3rd Status report of degree project: Integrating SWI-Prolog for semantic reasoning in Bioclipse. Samuel Lampa, 2010-04-07. Project blog: http://saml.rilspace.com

3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse

DESCRIPTION

The third and last project update on my thesis project titled "Integrating SWI-Prolog for Semantic Reasoning in Bioclipse".

Text of 3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse

  • 1. 3rd Status report of degree project: Integrating SWI-Prolog for semantic reasoning in Bioclipse. Samuel Lampa, 2010-04-07. Project blog: http://saml.rilspace.com

  • 2. Research question
    How do biochemical questions formulated as Prolog queries compare to other solutions available in Bioclipse in terms of speed and expressiveness?

  • 3. Compared Semantic Tools
    Jena
      General RDF querying (via SPARQL)
    Pellet
      OWL-DL reasoning (via SPARQL)
      General querying via Jena (via SPARQL)
    SWI-Prolog
      Access to RDF triples (both assertion and querying) via the rdf( Subject, Predicate, Object ) method
      Complex wrapper/convenience methods can be built

  • 4. Use Case: NMRShiftDB
    Interesting use case: Querying NMRShiftDB data
    Characteristics:
      Rather shallow RDF graph
      Numeric (float value) interval matching

  • 5. NMR Spectrum Similarity Search
    What to test: Given a spectrum, represented as a list of shift values, find spectra with the same shifts (allowing variation within a limit).
    [Figure: spectrum plot, axes Intensity and Shift]
    Dereferencing spectra

  • 6. Example Data
      :hasSpectrum ; :moleculeId "234".
      :hasPeak ,,,
      :hasShift "17.6"^^xsd:decimal .
      :hasShift "18.3"^^xsd:decimal .
      :hasShift "22.6"^^xsd:decimal .

  • 7. Prolog code
      % Load the RDF store library (provides rdf/3 and rdf_register_ns/2)
      :- use_module(library(semweb/rdf_db)).

      % Register RDF namespaces, for use in the convenience methods at the end
      :- rdf_register_ns(nmr, 'http://www.nmrshiftdb.org/onto#').
      :- rdf_register_ns(xsd, 'http://www.w3.org/2001/XMLSchema#').

      find_mol_with_peak_vals_near( SearchShiftVals, Mols ) :-
          % Pick each 'Mol' that matches the pattern:
          %   list_peak_shifts_of_mol( Mol, MolShiftVals ),
          %   contains_list_elems_near( SearchShiftVals, MolShiftVals )
          % and collect them in 'Mols'.
          % MolShiftVals^ keeps setof/3 from backtracking over each distinct
          % shift list, so all matching molecules end up in one result list.
          setof( Mol,
                 MolShiftVals^(
                     list_peak_shifts_of_mol( Mol, MolShiftVals ),              % A Mol's shift values are collected
                     contains_list_elems_near( SearchShiftVals, MolShiftVals )  % and compared against the given SearchShiftVals
                 ),
                 Mols ).  % 'Mols' holds every 'Mol' whose shift values match the SearchShiftVals
      % Given a 'Mol', give its shift values in list form, in 'ListOfPeaks'
      list_peak_shifts_of_mol( Mol, ListOfPeaks ) :-
          has_spectrum( Mol, Spectrum ),
          findall( ShiftVal,
                   ( has_peak( Spectrum, Peak ),
                     has_shift_val( Peak, ShiftVal ) ),
                   ListOfPeaks ).

      % Compare two lists to see if the second list has near-matches
      % for each of the values in the first list
      contains_list_elems_near( [ElemHead|ElemTail], List ) :-
          member_close_to( ElemHead, List ),
          (   contains_list_elems_near( ElemTail, List )
          ;   ElemTail == []
          ).

      %%%%%%%%%%%%%%%%%%%%%%%%
      % Recursive construct: %
      %%%%%%%%%%%%%%%%%%%%%%%%

      % Test first the end criterion:
      member_close_to( X, [ Y | _ ] ) :-
          closeTo( X, Y ).
      % but if the above doesn't validate, then recursively continue with the tail of the list:
      member_close_to( X, [ _ | Tail ] ) :-
          member_close_to( X, Tail ).

      % Numerical near-match
      closeTo( Val1, Val2 ) :-
          abs(Val1 - Val2) =< 0.3.

      %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
      % Convenience accessory methods %
      %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

      has_shift_val( Peak, ShiftVal ) :-
          rdf( Peak, nmr:hasShift, literal(type(xsd:decimal, ShiftValLiteral)) ),
          atom_number_create( ShiftValLiteral, ShiftVal ).

      % Property names follow the example data and the SPARQL query (hasSpectrum, hasPeak)
      has_spectrum( Mol, Spectrum ) :-
          rdf( Mol, nmr:hasSpectrum, Spectrum ).

      has_peak( Spectrum, Peak ) :-
          rdf( Spectrum, nmr:hasPeak, Peak ).

      % Wrapper method for the atom_number/2 method, which converts atoms (string constants) to numbers.
      % The wrapper avoids exceptions on empty atoms, converting them into a zero instead.
      atom_number_create( Atom, Number ) :-
          (   atom_length( Atom, AtomLength ),
              AtomLength > 0                   % IF atom is not empty
          ->  atom_number( Atom, Number )      % THEN convert the atom to a numerical value
          ;   atom_number( '0', Number )       % ELSE convert to a zero
          ).
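The matching rule in the Prolog code above can be restated in a few lines of Python, which may help when checking what counts as a hit. This is only a rough sketch and not part of the project: the function names and the `shifts_by_mol` dictionary are hypothetical stand-ins for the RDF store, the 0.3 tolerance is the one used in closeTo, and the only data used is molecule "234" with the three shifts shown on the example data slide.

```python
TOLERANCE = 0.3  # same limit as closeTo/2 in the Prolog code


def close_to(a, b, tol=TOLERANCE):
    """Numerical near-match: |a - b| <= tol."""
    return abs(a - b) <= tol


def contains_elems_near(search_shifts, mol_shifts, tol=TOLERANCE):
    """True if every search shift has at least one near-match in mol_shifts.

    Like the Prolog clause, an empty search list does not match anything.
    """
    return bool(search_shifts) and all(
        any(close_to(s, m, tol) for m in mol_shifts) for s in search_shifts
    )


def find_mols_near(search_shifts, shifts_by_mol, tol=TOLERANCE):
    """Return the sorted ids of all molecules whose shifts match the search list."""
    return sorted(
        mol
        for mol, shifts in shifts_by_mol.items()
        if contains_elems_near(search_shifts, shifts, tol)
    )


# Hypothetical in-memory stand-in for the RDF data (values from the example data slide)
shifts_by_mol = {"234": [17.6, 18.3, 22.6]}

print(find_mols_near([17.5, 22.8], shifts_by_mol))  # prints ['234']
print(find_mols_near([17.5, 30.0], shifts_by_mol))  # prints []
```

Note that, as in the Prolog version, one peak of a molecule may serve as the near-match for several search values; the comparison is not one-to-one.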
  • 8. SPARQL code
      PREFIX owl:
      PREFIX afn:
      PREFIX fn:
      PREFIX nmr:
      PREFIX xsd:
      PREFIX rdfs:
      SELECT ?s WHERE {
        ?s nmr:hasPeak [ nmr:hasShift ?s1 ] ,
                       [ nmr:hasShift ?s2 ] ,
                       [ nmr:hasShift ?s3 ] ,
                       [ nmr:hasShift ?s4 ] ,
                       [ nmr:hasShift ?s5 ] ,
                       [ nmr:hasShift ?s6 ] ,
                       [ nmr:hasShift ?s7 ] ,
                       [ nmr:hasShift ?s8 ] ,
                       [ nmr:hasShift ?s9 ] ,
                       [ nmr:hasShift ?s10 ] ,
                       [ nmr:hasShift ?s11 ] ,
                       [ nmr:hasShift ?s12 ] ,
                       [ nmr:hasShift ?s13 ] ,
                       [ nmr:hasShift ?s14 ] ,
                       [ nmr:hasShift ?s15 ] ,
                       [ nmr:hasShift ?s16 ] .
        FILTER ( fn:abs(?s1 - 17.6) < 0.3 ) .
        FILTER ( fn:abs(?s2 - 18.3) < 0.3 ) .
        FILTER ( fn:abs(?s3 - 22.6) < 0.3 ) .
        FILTER ( fn:abs(?s4 - 26.5) < 0.3 ) .
        FILTER ( fn:abs(?s5 - 31.7) < 0.3 ) .
        FILTER ( fn:abs(?s6 - 33.5) < 0.3 ) .
        FILTER ( fn:abs(?s7 - 33.5) < 0.3 ) .
        FILTER ( fn:abs(?s8 - 41.8) < 0.3 ) .
        FILTER ( fn:abs(?s9 - 42.0) < 0.3 ) .
        FILTER ( fn:abs(?s10 - 42.2) < 0.3 ) .
        FILTER ( fn:abs(?s11 - 78.34) < 0.3 ) .
        FILTER ( fn:abs(?s12 - 140.99) < 0.3 ) .
        FILTER ( fn:abs(?s13 - 158.3) < 0.3 ) .
        FILTER ( fn:abs(?s14 - 193.4) < 0.3 ) .
        FILTER ( fn:abs(?s15 - 203.0) < 0.3 ) .
        FILTER ( fn:abs(?s16 - 0) < 0.3 ) .
      }

  • 9. Expressiveness

  • 10. Expressivity: SPARQL vs Prolog
    [Figure panels: SPARQL, Prolog]

  • 11. Prolog predicate taking variables
    How to change input parameters?
      SPARQL: Modify the SPARQL query
      Prolog: Change the input parameter

  • 12. Observations
    SPARQL
      Fewer lines of code
      Easier to understand the code
    Prolog
      Easier to change input parameters
      Easier to re-use existing logic (call a method rather than cut and paste SPARQL code)
      Easier to change aspects of the execution logic

  • 13. Performance

  • 14. Prolog vs Jena vs JenaTDB vs Pellet

  • 15. Prolog vs Jena vs JenaTDB

  • 16. Observations
      Prolog is the fastest (in-memory only)
      Jena is faster with a disk-based store than with an in-memory RDF store!
      Pellet with an in-memory store is slow
      Pellet with a disk-based store is out of the question

  • 17. Project plan from last
    Planned final presentation: 28 April 2010 (BMC B7:101a)
    Everybody is welcome!

  • 18. Thank you!
    Project blog: http://saml.rilspace.com
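The deck observes that changing the input of the SPARQL version means modifying the query text itself. One possible workaround, not something the slides propose, is to generate that text from a list of shift values in a host language. The Python sketch below assumes the query shape shown on the SPARQL code slide; the function name `build_shift_query` is hypothetical, and the PREFIX declarations are left out because their IRIs are not present in this transcript.

```python
def build_shift_query(shifts, tolerance=0.3):
    """Build the SELECT part of the shift-matching SPARQL query.

    One blank-node peak pattern and one FILTER is generated per shift value,
    following the layout of the hand-written query on the slide.
    PREFIX declarations must be prepended by the caller.
    """
    if not shifts:
        raise ValueError("at least one shift value is required")

    # One "[ nmr:hasShift ?sN ]" pattern per search value
    peaks = " ,\n".join(
        f"      [ nmr:hasShift ?s{i} ]" for i in range(1, len(shifts) + 1)
    )
    # One tolerance filter per search value
    filters = "\n".join(
        f"  FILTER ( fn:abs(?s{i} - {value}) < {tolerance} ) ."
        for i, value in enumerate(shifts, start=1)
    )
    return f"SELECT ?s WHERE {{\n  ?s nmr:hasPeak\n{peaks} .\n{filters}\n}}"


# Usage: three of the shift values from the slides
print(build_shift_query([17.6, 18.3, 22.6]))
```

This narrows the gap in ease of changing input parameters, but the matching logic still lives in a string template rather than in reusable predicates, which is the trade-off the Observations slide points at.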