Ontotext's slides describing the Lucene and Elasticsearch connectors
GraphDB Connectors
Nikola Petrov<[email protected]>
Ontotext AD
September 8, 2014
Nikola Petrov<[email protected]> GraphDB Connectors
OWLIM GraphDB Family
GRAPHDB Server - the database engine
GRAPHDB Workbench - tool support that replaces the Sesame Workbench
GRAPHDB Connectors - implements advanced search scenarios
Why? What we are trying to solve
Our models are really rich/complex, but we still want to get the results fast
Most of the queries in the search space are of the form:
SELECT <projection variables> WHERE {
    {
        ... many triple patterns to filter data ...
        ... potentially regex or full-text search ...
    }
    ...
    {
        ... many triple patterns to retrieve data ...
        ... often many optional and aggregates ...
    }
}
We want to optimize the filtering part
Design philosophy
How it all started
Analyzed all major production systems projects
Collected feedback from partners
Reused features and know-how from multiple projects
Main Features
Real time synchronization
Property chains support (data denormalization)
Full-text search and result snippets
Facet and co-occurrence
Visual interface for index creation
Everything is integrated in SPARQL
Replication model
Real time synchronization
Create a link with a special predicate, and we will index/synchronize the matching resources in the external store
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

INSERT DATA {
    inst:my_index :createConnector '''
        # tell us what resources you want to index and which predicate
        # values about them. We support predicate chains if the properties of
        # the resource are not direct
    ''' .
}
When you add/change resources that match the criteria, we will automatically update the external store
Note: this might slow your insert statements a little (our tests show that the overhead is negligible)
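The write-through behaviour described above can be pictured with a toy model. This is only an illustrative sketch, not the connector's implementation: the `SyncedStore` class, its method names, and the use of a plain dict as the "external store" are all assumptions made for the example, and it ignores inference (so `:RedWine rdfs:subClassOf :Wine` would not be honoured here).

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


class SyncedStore:
    """Toy triple store that mirrors matching resources into an external index.

    Hypothetical illustration only: a dict stands in for Elasticsearch/Lucene.
    """

    def __init__(self, indexed_type, fields):
        self.indexed_type = indexed_type  # only resources of this type are indexed
        self.fields = fields              # field name -> predicate (assumed mapping)
        self.triples = set()
        self.external_index = {}          # stand-in for the external store

    def add(self, s, p, o):
        self.triples.add((s, p, o))
        self._sync(s)                     # synchronization happens as part of the write

    def remove(self, s, p, o):
        self.triples.discard((s, p, o))
        self._sync(s)

    def _sync(self, subject):
        # Collect the current property values of the touched resource.
        props = defaultdict(list)
        for s, p, o in self.triples:
            if s == subject:
                props[p].append(o)
        if self.indexed_type not in props[RDF_TYPE]:
            # The resource does not (or no longer) match the criteria.
            self.external_index.pop(subject, None)
            return
        self.external_index[subject] = {
            name: sorted(props[pred]) for name, pred in self.fields.items()
        }


store = SyncedStore(":Wine", {"sugar": ":hasSugar"})
store.add(":Yoyowine", ":hasSugar", "dry")   # not indexed yet: no matching type
store.add(":Yoyowine", RDF_TYPE, ":Wine")    # now matches, so it gets indexed
print(store.external_index)
```

Because every insert re-synchronizes the touched resource, each write carries a small extra cost, which is the slowdown the note refers to.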
Example data with wines
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix : <http://www.ontotext.com/example/wine#> .

:RedWine rdfs:subClassOf :Wine .
:WhiteWine rdfs:subClassOf :Wine .
:RoseWine rdfs:subClassOf :Wine .

:Merlo
    rdf:type :Grape ;
    rdfs:label "Merlo" .

:CabernetSauvignon
    rdf:type :Grape ;
    rdfs:label "Cabernet Sauvignon" .

:CabernetFranc
    rdf:type :Grape ;
    rdfs:label "Cabernet Franc" .

:PinotNoir
    rdf:type :Grape ;
    rdfs:label "Pinot Noir" .

:Chardonnay
    rdf:type :Grape ;
    rdfs:label "Chardonnay" .
:Yoyowine
    rdf:type :RedWine ;
    :madeFromGrape :CabernetSauvignon ;
    :hasSugar "dry" ;
    :hasYear "2013"^^xsd:integer .

:Franvino
    rdf:type :RedWine ;
    :madeFromGrape :Merlo ;
    :madeFromGrape :CabernetFranc ;
    :hasSugar "dry" ;
    :hasYear "2012"^^xsd:integer .

:Noirette
    rdf:type :RedWine ;
    :madeFromGrape :PinotNoir ;
    :hasSugar "medium" ;
    :hasYear "2012"^^xsd:integer .

:Blanquito
    rdf:type :WhiteWine ;
    :madeFromGrape :Chardonnay ;
    :hasSugar "dry" ;
    :hasYear "2012"^^xsd:integer .

:Rozova
    rdf:type :RoseWine ;
    :madeFromGrape :PinotNoir ;
    :hasSugar "medium" ;
    :hasYear "2013"^^xsd:integer .
Create our wines index
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

INSERT DATA {
    inst:wines :createConnector '''
{
    "elasticsearchNode": "localhost:9300",
    "types": [
        "http://www.ontotext.com/example/wine#Wine"
    ],
    "fields": [
        {
            "fieldName": "grape",
            "propertyChain": [
                "http://www.ontotext.com/example/wine#madeFromGrape",
                "http://www.w3.org/2000/01/rdf-schema#label"
            ]
        },
        {
            "fieldName": "sugar",
            "propertyChain": ["http://www.ontotext.com/example/wine#hasSugar"],
            "sort": true
        },
        {
            "fieldName": "year",
            "propertyChain": ["http://www.ontotext.com/example/wine#hasYear"]
        }
    ]
}
''' .
}
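If hand-writing the JSON inside a SPARQL string feels error-prone, the update can also be generated. The sketch below builds the same configuration as a Python dict and wraps it in the `INSERT DATA` statement from the slide. The helper name `create_connector_update` is made up for this example, and sending the resulting string to a repository is left out, since the endpoint details are not part of the deck.

```python
import json

WINE = "http://www.ontotext.com/example/wine#"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Same configuration as on the slide, expressed as a Python structure.
config = {
    "elasticsearchNode": "localhost:9300",
    "types": [WINE + "Wine"],
    "fields": [
        {"fieldName": "grape", "propertyChain": [WINE + "madeFromGrape", RDFS_LABEL]},
        {"fieldName": "sugar", "propertyChain": [WINE + "hasSugar"], "sort": True},
        {"fieldName": "year", "propertyChain": [WINE + "hasYear"]},
    ],
}


def create_connector_update(name, config):
    """Wrap a connector configuration in the SPARQL update shown on the slide.

    Hypothetical helper: `name` becomes the local name under the inst: prefix.
    """
    body = json.dumps(config, indent=2)
    if "'''" in body:
        # The JSON is embedded in a '''-quoted SPARQL literal.
        raise ValueError("configuration must not contain a ''' sequence")
    return (
        "PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>\n"
        "PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>\n"
        "\n"
        "INSERT DATA {\n"
        f"  inst:{name} :createConnector '''\n{body}\n''' .\n"
        "}\n"
    )


update = create_connector_update("wines", config)
print(update)
```

Generating the JSON with `json.dumps` guarantees balanced brackets and correct quoting, which are the usual sources of mistakes in a hand-edited configuration.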
But wait, I don’t understand all this json stuff...
Elasticsearch index
Given our small wines dataset, here is what will get indexed in Elasticsearch
wine        year  grape                  sugar
:Yoyowine   2013  Cabernet Sauvignon     dry
:Noirette   2012  Pinot Noir             medium
:Blanquito  2012  Chardonnay             dry
:Franvino   2012  Merlo, Cabernet Franc  dry
:Rozova     2013  Pinot Noir             medium
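The "grape" column is where the property chain matters: the label lives on the grape, not on the wine, so the value is denormalized into the wine's document. A minimal sketch of that idea follows, using a subset of the example triples with shortened names. The functions `follow_chain` and `build_document` are illustrative names, not connector APIs.

```python
# A subset of the wine example, reduced to the triples the index needs.
TRIPLES = [
    (":Merlo", "rdfs:label", "Merlo"),
    (":CabernetSauvignon", "rdfs:label", "Cabernet Sauvignon"),
    (":CabernetFranc", "rdfs:label", "Cabernet Franc"),
    (":Yoyowine", ":madeFromGrape", ":CabernetSauvignon"),
    (":Yoyowine", ":hasSugar", "dry"),
    (":Yoyowine", ":hasYear", 2013),
    (":Franvino", ":madeFromGrape", ":Merlo"),
    (":Franvino", ":madeFromGrape", ":CabernetFranc"),
    (":Franvino", ":hasSugar", "dry"),
    (":Franvino", ":hasYear", 2012),
]

# Field definitions mirroring the connector configuration (prefixed names for brevity).
FIELDS = {
    "grape": [":madeFromGrape", "rdfs:label"],
    "sugar": [":hasSugar"],
    "year": [":hasYear"],
}


def follow_chain(start, chain, triples):
    """Walk a property chain from `start` and return every value reached."""
    nodes = [start]
    for predicate in chain:
        nodes = [
            o
            for node in nodes
            for (s, p, o) in triples
            if s == node and p == predicate
        ]
    return nodes


def build_document(entity, fields, triples):
    """Flatten one entity into the document that would be sent to the index."""
    return {name: follow_chain(entity, chain, triples) for name, chain in fields.items()}


doc = build_document(":Franvino", FIELDS, TRIPLES)
print(doc)
```

A wine made from two grapes simply ends up with a multi-valued "grape" field, which matches the ":Franvino" row above.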
You can query the index and join the results in SPARQL
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

SELECT ?entity {
    ?search a inst:<index-name> ;
        :query "<full-text-query>" ;
        :entities ?entity .
    # Do anything you want with the bound entity
}
Get wines that are made from cabernet grape
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>
PREFIX wine: <http://www.ontotext.com/example/wine#>

SELECT ?wine ?year {
    ?search a inst:wines ;
        :query "grape:cabernet" ;
        :entities ?wine .
    # We combine the results from elasticsearch and join them in graphdb
    ?wine wine:hasYear ?year
}

Wine       Year
:Yoyowine  2013
:Franvino  2012
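Conceptually the query runs in two steps: the external index answers the full-text part, and the database joins the returned entities with the rest of the graph pattern. The sketch below imitates that with plain dictionaries. The word-level matching in `search` is a crude stand-in for real full-text analysis, and all function names are assumptions made for illustration.

```python
# What the connector indexed (grape field only), keyed by wine.
INDEX = {
    ":Yoyowine": {"grape": ["Cabernet Sauvignon"]},
    ":Franvino": {"grape": ["Merlo", "Cabernet Franc"]},
    ":Noirette": {"grape": ["Pinot Noir"]},
    ":Blanquito": {"grape": ["Chardonnay"]},
    ":Rozova": {"grape": ["Pinot Noir"]},
}

# The wine:hasYear statements that stay in the database.
HAS_YEAR = {
    ":Yoyowine": 2013,
    ":Franvino": 2012,
    ":Noirette": 2012,
    ":Blanquito": 2012,
    ":Rozova": 2013,
}


def search(index, field, term):
    """Step 1: the index returns entities whose field contains the term as a word."""
    term = term.lower()
    return [
        entity
        for entity, doc in index.items()
        if any(term in value.lower().split() for value in doc.get(field, []))
    ]


def wines_with_year(index, years, field, term):
    """Step 2: join the matching entities with a pattern evaluated in the database."""
    return [(wine, years[wine]) for wine in search(index, field, term) if wine in years]


rows = wines_with_year(INDEX, HAS_YEAR, "grape", "cabernet")
print(rows)
```

The expensive filtering is answered by the index, while the join only touches the handful of entities it returned.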
Other things that we expose for the matches
The matching snippet for the full-text query
The score from elasticsearch for the match
Total number of hits in the external store
Ordering the external store results
Faceting Usecase LCIE Co-occurrence
Faceting Usecase LCIE Facets and Full-text
Faceting Usecase Global Capital
Faceting Usecase
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

SELECT ?facetName ?facetValue ?facetCount WHERE {
    # note empty query is allowed and will just match all documents, hence no :query
    ?r a inst:wines ;
        :facetFields "year,sugar" ;
        :facets _:f .
    _:f :facetName ?facetName .
    _:f :facetValue ?facetValue .
    _:f :facetCount ?facetCount .
}

Facet Name  Facet Value  Facet Count
year        2012         3
year        2013         2
sugar       dry          3
sugar       medium       2
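The facet counts above are simply value frequencies per field over the matching documents (here, all five wines). A small sketch that reproduces the table from the example data follows. The `facets` function is an illustrative stand-in and says nothing about how Elasticsearch computes facets internally.

```python
from collections import Counter

# The indexed wine documents, restricted to the two facet fields.
DOCS = {
    ":Yoyowine": {"year": 2013, "sugar": "dry"},
    ":Franvino": {"year": 2012, "sugar": "dry"},
    ":Noirette": {"year": 2012, "sugar": "medium"},
    ":Blanquito": {"year": 2012, "sugar": "dry"},
    ":Rozova": {"year": 2013, "sugar": "medium"},
}


def facets(docs, facet_fields):
    """Return (facet name, facet value, facet count) rows for a comma-separated field list."""
    rows = []
    for field in facet_fields.split(","):
        counts = Counter(doc[field] for doc in docs.values())
        for value, count in sorted(counts.items(), key=lambda kv: str(kv[0])):
            rows.append((field, value, count))
    return rows


rows = facets(DOCS, "year,sugar")
for row in rows:
    print(*row)
```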
Entity filtering (Advanced topic)
A way to filter the resources that land in the external store based on the field value
Allows you to use the same predicate/property chain for different fields (based on a filter)
Consider the use case of articles that reference people, locations, etc.
Entity filtering
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

INSERT DATA {
    inst:my_index :createConnector '''
{
    "elasticsearchNode": "localhost:9200",
    "types": ["http://www.ontotext.com/example2#Article"],
    "fields": [
        {
            "fieldName": "comment",
            "propertyChain": ["http://www.w3.org/2000/01/rdf-schema#comment"]
        },
        {
            "fieldName": "taggedWithPerson",
            "propertyChain": ["http://www.ontotext.com/example2#taggedWith"]
        },
        {
            "fieldName": "taggedWithLocation",
            "propertyChain": ["http://www.ontotext.com/example2#taggedWith"]
        }
    ],
    "entityFilter": "?taggedWithPerson type in (<http://www.ontotext.com/example2#Person>) && ?taggedWithLocation type in (<http://www.ontotext.com/example2#Location>)"
}
''' .
}
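One way to read this configuration: both fields follow the same `taggedWith` property, and the filter decides which values each field keeps. The sketch below models that reading on invented data. The article, person, and location resources are hypothetical, and applying the filter per value is an assumption drawn from the bullet points above, not a statement of the connector's exact semantics.

```python
EX = "http://www.ontotext.com/example2#"

# Hypothetical data: one article tagged with a person and a location.
TYPES = {
    EX + "alice": EX + "Person",
    EX + "sofia": EX + "Location",
}
TAGGED_WITH = {
    EX + "article1": [EX + "alice", EX + "sofia"],
}

# Per-field type restriction, mirroring the two clauses of the entityFilter.
FIELD_TYPE_FILTER = {
    "taggedWithPerson": {EX + "Person"},
    "taggedWithLocation": {EX + "Location"},
}


def build_document(article):
    """Split the values of the shared taggedWith property across the two fields."""
    values = TAGGED_WITH.get(article, [])
    return {
        field: [value for value in values if TYPES.get(value) in allowed]
        for field, allowed in FIELD_TYPE_FILTER.items()
    }


doc = build_document(EX + "article1")
print(doc)
```

Under this reading, the same predicate feeds two differently named fields, so a search can target people and locations separately.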
Things that don’t work right now but will in the near future
support for remote external stores (Solr, Elasticsearch) in a GraphDB cluster setup
filtering anywhere in the property chain (we currently support filtering only on the indexed value)
filtering implicit from explicit statements
Query tree
Thank You
https://confluence.ontotext.com/display/GraphDB6/GraphDB+Connectors
Questions?