The Dynamic Duo: OrientDB & Lucene
Enrico Risa
Outline
❖ Apache Lucene in a nutshell
❖ OrientDB Indexing
❖ OrientDB-Lucene
  - Full Text Index
  - Spatial Index
❖ Roadmap 2.0
What Is Lucene?
❖ Free-text indexing library
❖ Implements standard IR/search functionality: query models, ranking, indexing
❖ Written in Java
❖ Simple API
❖ Fast, mature, and constantly evolving
❖ Many extension points
Who uses Lucene?
❖ Twitter
❖ LinkedIn
❖ Apple
❖ Solr
❖ Elasticsearch
❖ Neo4j
❖ and now OrientDB
Basic Lucene workflow
Documents
❖ Basic unit for indexing and searching
❖ Contains a list of Fields
❖ Schema-less
Fields
❖ Basic component of a Document
❖ A Field has:
  - name
  - value
  - store
  - analyzed
Field Types & Options
❖ Types
  - Field
  - StringField
  - TextField
  - StoredField
  - IntField
  - … more
❖ Options
  - Stored or not
  - Indexed or not
  - Analyzed or not
Directory
❖ RAMDirectory: RAM-based index
❖ FSDirectory: file-based index
❖ NIOFSDirectory: same as FSDirectory, but using the NIO API
Indexing Documents
Searching Index
Inverted Index
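An inverted index maps each term to the documents that contain it, which is what lets a search look up a term directly instead of scanning every document. The sketch below is a toy illustration in plain Java, not Lucene's actual data structure: the class name `InvertedIndexSketch`, the naive lowercase tokenizer, and the integer document ids are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Toy inverted index: term -> sorted set of document ids (hypothetical, not Lucene code).
final class InvertedIndexSketch {
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // "Analyze" the text with a naive tokenizer (lowercase, split on anything
    // that is not a letter or digit) and record the document id under each term.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the ids of the documents containing the term, in ascending order.
    List<Integer> search(String term) {
        TreeSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        if (ids == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(ids);
    }
}
```

Indexing "Rome is the capital of Italy" as document 1 and "When in Rome" as document 3 makes `search("rome")` return both ids. A real analyzer would also handle stop words, stemming, and term positions.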
Luke: a graphical user interface
❖ Open a Lucene index
❖ Browse documents
❖ Run queries
❖ …
OrientDB Indexing
❖ SBTree (Unique, Not unique, Full Text, Dictionary)
❖ HashIndex (Unique, Not unique, Full Text, Dictionary)
❖ MVRB-Tree (Deprecated since 1.6)
❖ Lucene (OrientDB-Lucene)
❖ … https://github.com/orientechnologies/orientdb/wiki/Custom-Index-Engine
OrientDB Lucene
❖ Open source at https://github.com/orientechnologies/orientdb-lucene
❖ This project aims to bring the power of Lucene indexes into OrientDB.
❖ Supports only Spatial and Full Text indexes
Installing OrientDB Lucene
❖ Embedded mode
❖ Server mode: grab a JAR build and copy it into $ORIENTDB_HOME/plugins
Spatial Index
❖ No native implementation.
❖ Built on top of the Lucene Spatial module.
❖ Currently only points are supported.
❖ NEAR and WITHIN queries.
Lucene Spatial
❖ Spatial4j
  - Handles shapes (Point, Circle, Rectangle, Polygon)
  - Distance and area math utilities
  - Reads the WKT format
❖ Provides an indexing strategy: RecursivePrefixTree
❖ Spatial queries using shapes
Creating a Spatial Index
❖ SQL
❖ JAVA
Spatial Operators
❖ NEAR: find all points near a given location (latitude, longitude)
❖ WITHIN: find all points within a given bounding box
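The SQL shown on the original slides did not survive extraction. As a hedged reconstruction, recalled from the OrientDB-Lucene documentation of that period rather than taken from the talk, the statements had roughly the shape built by the helper below. The class `SpatialSqlSketch`, the `Place` class, its `latitude`/`longitude` fields, and the index name are hypothetical, and the exact syntax should be checked against the plugin version in use.

```java
// Builds OrientDB SQL strings for a Lucene spatial index.
// Assumption: statement shapes recalled from the plugin docs; all names are hypothetical.
final class SpatialSqlSketch {

    // CREATE INDEX <name> ON <class>(<lat>,<lon>) SPATIAL ENGINE LUCENE
    static String createIndex(String indexName, String cls, String latField, String lonField) {
        return "CREATE INDEX " + indexName + " ON " + cls
                + "(" + latField + "," + lonField + ") SPATIAL ENGINE LUCENE";
    }

    // NEAR takes the indexed fields plus $spatial on the left, and the
    // reference point plus an options map on the right.
    static String near(String cls, String latField, String lonField,
                       double lat, double lon, double maxDistance) {
        return "SELECT FROM " + cls + " WHERE [" + latField + "," + lonField + ",$spatial] NEAR ["
                + lat + "," + lon + ",{\"maxDistance\":" + maxDistance + "}]";
    }

    // WITHIN takes the two corner points of the bounding box.
    static String within(String cls, String latField, String lonField,
                         double lat1, double lon1, double lat2, double lon2) {
        return "SELECT FROM " + cls + " WHERE [" + latField + "," + lonField + "] WITHIN [["
                + lat1 + "," + lon1 + "],[" + lat2 + "," + lon2 + "]]";
    }
}
```

String concatenation is used only to make the statement shapes visible. Real code should prefer whatever parameter binding the OrientDB client offers.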
Near Operator
❖ Custom operator that relies on the Lucene index
❖ Special syntax to support spatial arguments ($spatial)
❖ Context variable $distance
❖ Result set sorted from nearest to farthest
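To make the NEAR semantics concrete, the sketch below filters points by distance from a reference location and returns them nearest first, with the computed distance playing the role that `$distance` plays in a query. It is a brute-force illustration in plain Java, not how the Lucene index evaluates the query. The haversine formula, the 6371 km Earth radius, and the names `NearSketch` and `Place` are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Brute-force illustration of NEAR: filter by distance, then sort nearest first.
final class NearSketch {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius (assumption)

    static final class Place {
        final String name;
        final double lat;
        final double lon;

        Place(String name, double lat, double lon) {
            this.name = name;
            this.lat = lat;
            this.lon = lon;
        }
    }

    // Great-circle distance in kilometres between two (lat, lon) points in degrees.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Keep the places within maxDistanceKm of (lat, lon), ordered nearest to farthest.
    static List<Place> near(List<Place> places, double lat, double lon, double maxDistanceKm) {
        List<Place> hits = new ArrayList<>();
        for (Place p : places) {
            if (haversineKm(lat, lon, p.lat, p.lon) <= maxDistanceKm) {
                hits.add(p);
            }
        }
        hits.sort(Comparator.comparingDouble((Place p) -> haversineKm(lat, lon, p.lat, p.lon)));
        return hits;
    }
}
```

A spatial index avoids this full scan by narrowing the candidates with prefix-tree cells before any exact distance is computed.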
Within Operator
❖ Bounding box search
❖ Currently finds points within a box
❖ Result set is not sorted
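The bounding-box test behind WITHIN reduces to two range checks per point. The sketch below shows that check in plain Java. The class `WithinSketch` and its parameter order are assumptions of this example, and it deliberately ignores boxes that cross the antimeridian.

```java
// Point-in-bounding-box check (hypothetical helper, not plugin code).
final class WithinSketch {
    // True when (lat, lon) lies inside the box spanned by the two corners, edges included.
    // Assumes minLat <= maxLat and minLon <= maxLon, with no antimeridian crossing.
    static boolean within(double lat, double lon,
                          double minLat, double minLon,
                          double maxLat, double maxLon) {
        return lat >= minLat && lat <= maxLat
                && lon >= minLon && lon <= maxLon;
    }
}
```

Because the check is a pure yes/no filter with no distance involved, there is no natural ordering for the matches, which fits the unsorted result set.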
Full Text Index
❖ Native Full Text implementation.
❖ Supports multiple fields.
❖ Supports Lucene query syntax.
❖ Lucene Analyzers
Creating a Full Text Index
❖ SQL
❖ JAVA
Full Text Operators
❖ LUCENE: [<fields>] LUCENE <exp>
  - Query your index using Query Parser syntax
  - Supports multiple fields
    - Target all fields (MultiFieldQueryParser)
    - Target a specific field (QueryParser)
Lucene Operator
❖ MultiFieldQueryParser: target all fields
❖ QueryParser: target a specific field
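The SQL for the full-text slides was also lost in extraction. As a hedged reconstruction, recalled from the OrientDB-Lucene documentation of that period rather than taken from the talk, the index creation and the two query forms looked roughly like the strings built below. The class `FullTextSqlSketch`, the `City` class, and its field and index names are hypothetical, and the syntax should be verified against the plugin version in use.

```java
// Builds OrientDB SQL strings for a Lucene full-text index.
// Assumption: statement shapes recalled from the plugin docs; all names are hypothetical.
final class FullTextSqlSketch {

    // CREATE INDEX <name> ON <class>(<field>[,<field>...]) FULLTEXT ENGINE LUCENE
    static String createIndex(String indexName, String cls, String... fields) {
        return "CREATE INDEX " + indexName + " ON " + cls
                + "(" + String.join(",", fields) + ") FULLTEXT ENGINE LUCENE";
    }

    // Single-field form: <field> LUCENE "<query>"
    static String searchField(String cls, String field, String luceneQuery) {
        return "SELECT FROM " + cls + " WHERE " + field + " LUCENE \"" + luceneQuery + "\"";
    }

    // Multi-field form: [<f1>,<f2>] LUCENE "<query>"
    // No escaping is done here: illustration only.
    static String searchFields(String cls, String luceneQuery, String... fields) {
        return "SELECT FROM " + cls + " WHERE [" + String.join(",", fields)
                + "] LUCENE \"" + luceneQuery + "\"";
    }
}
```

The query string itself uses Lucene Query Parser syntax, so wildcards such as `Rom*` or field-qualified terms such as `name:rome` can be passed through unchanged.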
Indexing Performance
❖ Full Text: 9M records in ~300s with StandardAnalyzer and one field
❖ Spatial: 9M records in ~500s with two fields (Point)
Roadmap 2.0
❖ Production ready
❖ Monitoring of Lucene indexes
❖ More configuration
❖ GUI tool integrated in Studio
Roadmap 2.0 (Spatial Index)
❖ Index more shapes
❖ More operators (Intersect, …)
❖ Not only bounding boxes
❖ Support for GeoJSON http://geojson.org
Roadmap 2.0 (Full Text)
❖ Document & Field Boosting
❖ Score in result set
❖ Custom Analyzers & Filters
❖ Search Engine
Thank You
Questions?
❖ Contact me: Enrico Risa, Twitter https://twitter.com/wolf4ood