24
Advanced usage of indexes in Oracle Coherence Alexey Ragozin [email protected] Nov 2011

Advanced Usage of Indexes in Coherence

  • Upload
    gpeev

  • View
    102

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Advanced Usage of Indexes in Coherence

Advanced usage of indexes in Oracle Coherence

Alexey [email protected]

Nov 2011

Page 2: Advanced Usage of Indexes in Coherence

Presentation overview

Structure of Coherence indexHow IndexAwareFilterworksMultiple indexes in same queryCustom index provider API (since 3.6)Embedding Apache Lucene into data grid

Page 3: Advanced Usage of Indexes in Coherence

Creation of index

QueryMap.addIndex(ValueExtractor extractor, boolean ordered, Comparator comparator)

Attribute extractor, used to identify index later

Index configuration

Page 4: Advanced Usage of Indexes in Coherence

Using of query API

public interface QueryMap extends Map Set keySet(Filter f);;Set entrySet(Filter f);;Set entrySet(Filter f, Comparator c);;...

public interface InvocableMap extends Map Map invokeAll(Filter f, EntryProcessor agent);;Object aggregate(Filter f, EntryAggregator agent);;...

Page 5: Advanced Usage of Indexes in Coherence

Indexes at storage node

extractor indexextractor index

extractor index

Indexes

Backing map

Named cache backend

SimpleMapIndexReverse map Forward map

valkeykeykeykey

val keykey

val keykey

key valkey valkey val

Page 6: Advanced Usage of Indexes in Coherence

Indexes at storage node

All indexes created on cache are stored in map

Reverse map is used to speed up filtersForward map is used to speed up aggregators

Custom extractors should obey equals/hashCode contract!

QueryMap.Entry.extract

is using index, if available

Page 7: Advanced Usage of Indexes in Coherence

Indexes at storage node

Index structures are stored in heapand may consume a lot of memory

For partitioned schemekeys in index are binary blobs,regular object, otherwise

Indexes will keep your key in heap even if you use off heap backing mapSingle index for all primary partitions of cache on single node

Page 8: Advanced Usage of Indexes in Coherence

How filters use indexes?

interface IndexAwareFilter extends EntryFilter int calculateEffectiveness(Map im, Set keys);;Filter applyIndex(Map im, Set keys);;

applyIndexcalculateEffectivenessnested filterseach node executes index individuallyFor complex queries execution plan is calculated ad hoc, each compound filter calculates plan for nested filters

Page 9: Advanced Usage of Indexes in Coherence

Example: equalsFilter

Filter execution (call to applyIndex() )Lookup for matching index using extractor instance as keyIf index found,

lookup index reverse map for valueintersect provided candidate set with key set from reverse mapreturn null candidate set is accurate, no object filtering required

else (no index found)return this all entries from candidate set should be deserialized and evaluated by filter

Page 10: Advanced Usage of Indexes in Coherence

Multiple indexes in same query

Example: ticker=IBM & side=Bnew AndFilter(

new EqualsFilter getTickernew EqualsFilter getSide

Execution plancall applyIndex

only entries with ticker IBM are retained in candidate set

call applyIndexonly entries with side=B are retained in candidate set

return candidate set

Page 11: Advanced Usage of Indexes in Coherence

Index performance

PROsusing of inverted indexno deserialization overhead

CONsvery simplistic cost model in index plannercandidate set is stored in hash tables (intersections/unions may be expensive)high cardinality attributes may cause problems

Page 12: Advanced Usage of Indexes in Coherence

Compound indexes

Example: ticker=IBM & side=BIndex per attributenew AndFilter(new EqualsFilter getTicker , new EqualsFilter getSide )

Index for compound attributenew EqualsFilter(new MultiExtractor getTicker, getSide ,

Arrays.asList

extractor used to create index!

Page 13: Advanced Usage of Indexes in Coherence

Ordered indexes vs. unordered

19.23

1.63 1.37

0.61 0.721.19

0.1

1

10

100

Term count = 100k Term count = 10k Term count = 2k

Filte

r executio

n tim

e (m

s)

Unordered Ordered

Page 14: Advanced Usage of Indexes in Coherence

Custom indexes since 3.6

interface IndexAwareExtractorextends ValueExtractor

MapIndex createIndex(boolean ordered,Comparator comparator,Map indexMap,BackingMapContext bmc);;

MapIndex destroyIndex(Map indexMap);;

Page 15: Advanced Usage of Indexes in Coherence

Ingredients of customs index

Custom index extractorCustom index class (extends MapIndex)Custom filter, aware of custom index

+Thread safe implementationHandle both binary and object keys gracefullyEfficient insert (index is updates synchronously)

Page 16: Advanced Usage of Indexes in Coherence

Why custom indexes?

Custom index implementation is free to use any advanced data structure tailored for specific queries.

NGram index fast substring based lookupApache Lucene index full text searchTime series index managing versioned data

Page 17: Advanced Usage of Indexes in Coherence

Using Apache Lucene in grid

Why?Full text search / rich queriesZero index maintenance

PROsIndex partitioning by CoherenceFaster execution of many complex queries

CONsSlower updatesText centric

Page 18: Advanced Usage of Indexes in Coherence

Lucene example

Step 1. Create document extractor// First, we need to define how our object will map // to field in Lucene document LuceneDocumentExtractor extractor = new LuceneDocumentExtractor();;extractor.addText("title", new ReflectionExtractor("getTitle"));;extractor.addText("author", new ReflectionExtractor("getAuthor"));;extractor.addText("content", new ReflectionExtractor("getContent"));;extractor.addText("tags", new ReflectionExtractor("getSearchableTags"));;

Step 2. Create index on cache// next create LuceneSearchFactory helper class LuceneSearchFactory searchFactory = new LuceneSearchFactory(extractor);;// initialize index for cache, this operation actually tells coherence // to create index structures on all storage enabled nodes searchFactory.createIndex(cache);;

Page 19: Advanced Usage of Indexes in Coherence

Lucene example

Now you can use Lucene queries// now index is ready and we can search Coherence cache // using Lucene queries PhraseQuery pq = new PhraseQuery();;pq.add(new Term("content", "Coherence"));;pq.add(new Term("content", "search"));;// Lucene filter is converted to Coherence filter // by search factory cache.keySet(searchFactory.createFilter(pq));;

Page 20: Advanced Usage of Indexes in Coherence

Lucene example

You can even combine it with normal filters// You can also combine normal Coherence filters// with Lucene queries long startDate = System.currentTimeMillis() -­ 1000 * 60 * 60 * 24;;

// last day long endDate = System.currentTimeMillis();;BetweenFilter dateFilter = new BetweenFilter("getDateTime", startDate, endDate);;

Filter pqFilter = searchFactory.createFilter(pq);;// Now we are selecting objects by Lucene query and apply // standard Coherence filter over Lucene result set cache.keySet(new AndFilter(pqFilter, dateFilter));;

Page 21: Advanced Usage of Indexes in Coherence

Lucene search performance

0.72

0.71

1.10

1.09

3.30

1.80

1.16

1.18

4.38

4.39

1.93

1.96

2.38

2.38

0.67

7.23

1.49

7.77

8.81

8.75

1.53

8.66

15.96

15.96

11.15

11.12

52.59

8.74

0.5 5 50

A1=x & E1=y

E1=x & A1=y

D1=x & E1=y

E1=x & D1=y

E1=x & E2=y

E1=x & E2=Y & E3=z

D1=w & E1=x & E2=Y & E3=z

E1=x & E2=Y & E3=z & D1=w

A2 in [n..m] & E1=x & E2=Y & E3=z

E1=x & E2=Y & E3=z & A2 in [n..m]

H1=a & E1=x & E2=Y & E3=z

E1=x & E2=Y & E3=z & H1=a

Filter execution time (ms)

Lucene

Coherence

Page 22: Advanced Usage of Indexes in Coherence

Time series index

Special index for managing versioned data

Getting last version for series kselect * from versions where series=k and version =

(select max(version) from versions where key=k)

Series key Entry id Timestamp Payload

Entry key Entry value

Cache entry

Page 23: Advanced Usage of Indexes in Coherence

Time series index

Series inverted index

Series key

Series key

Series key

Series key

Series key

HA

SH

TAB

LE

Timestamp Entry ref

Timestamp Entry ref

Timestamp Entry ref

Timestamp inverted subindex

ORDER

Page 24: Advanced Usage of Indexes in Coherence

Thank you

Alexey [email protected]

http://aragozin.blogspot.com-­‐ my articleshttp://code.google.com/p/gridkit-­‐ my open source code