GLOBAL SECONDARY INDEXES
NEW HIGH PERFORMANCE INDEXER
Cihan Biyikoglu, Dir. Product Management
©2015 Couchbase Inc. 2
Goals
Get to know Global Secondary Indexes (GSI) – the new high performance indexer for N1QL
Look at Indexing lifecycle & management with GSI
Cover top best practices and tips with GSI
Agenda
Overview: Indexing in Couchbase Server 4.0
• Couchbase Server 4.0 Architecture
• Indexers in Couchbase Server 4.0
• Indexing today and Indexing with GSI
Working with Global Secondary Indexes
• GSI Architecture
• GSI Lifecycle - Creation & Maintenance
• Index Availability & Rebalance
• Index Placement and Load Balancing
• Monitoring GSI
• Best Practices for GSI
Q&A
Overview
Couchbase Server Cluster Architecture
[Diagram: a six-node cluster, Couchbase Server 1 through Couchbase Server 6. Each node is drawn with a Cluster Manager, a Managed Cache, Storage, and the Data Service, Index Service, and Query Service; data shards (SHARD5, SHARD7, SHARD9, ...) are shown on the nodes.]
Indexing in Couchbase Server 4.0: Multiple Indexers
GSI – Index Service
• New indexing for N1QL for low latency queries without compromising on mutation performance (insert/update/delete)
• Independently partitioned and independently scalable indexes in the Index Service
Map/Reduce Views – Data Service
• Powerful programmable indexer for complex reporting and indexing logic
• Full partition alignment and paired scalability with Data Service
Spatial View – Data Service
• Incremental R-tree indexing for powerful bounding-box queries
• Full partition alignment and paired scalability with Data Service
Query and Index Today
Once upon a time in a User Profile System…
Q1: Find the top 10 most "active" customers by #logins in Jan 2015
{ …
  "customer_name": "Cihan",
  "total_logins": { … "aug_2015": 100, … },
  "type": "customer_profile" …
}
[Diagram: Q1, Active @ Jan 2015]
Query and Index Today
CREATE INDEX index_name ON customer_bucket(customer_name, total_logins.jan_2015)
WHERE type = "customer_profile";

SELECT customer_name, total_logins.jan_2015
FROM customer_bucket
WHERE type = "customer_profile"
ORDER BY total_logins.jan_2015 DESC
LIMIT 10;

Q1: Execution Plan on N nodes
• Scatter: Execute Q1 on N nodes
• Gather: Gather N results
• Finalize: Execute Q1 on governor node
[Diagram: plan steps 1, 2 and 3 for Q1 (Active @ Jan 2015) marked on the cluster nodes]
Query and Index with GSI
CREATE INDEX index_name ON customer_bucket(customer_name, total_logins.jan_2015)
WHERE type = "customer_profile";

SELECT customer_name, total_logins.jan_2015
FROM customer_bucket
WHERE type = "customer_profile"
ORDER BY total_logins.jan_2015 DESC
LIMIT 10;

Q1: Execution Plan on N nodes
• Execute Q1 on N1QL Service node
• Scan index on Index Service node
[Diagram: plan steps for Q1 (Active @ Jan 2015) marked on the Query Service node and the Index Service node]
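The difference between the two execution plans can be illustrated with a small simulation. This is only a sketch under stated assumptions: the six partitions, the `logins` field, and both helper functions are hypothetical stand-ins, not Couchbase APIs. It contrasts a plan where every node computes a local top 10 that is then merged with a plan that reads the first 10 entries of one globally sorted index.

```python
import heapq
import random


def scatter_gather_top_k(partitions, k):
    """Local-index plan: every node answers, then one node merges."""
    # Scatter: each node computes a top-k over its own partition.
    partials = [
        heapq.nlargest(k, part, key=lambda d: d["logins"]) for part in partitions
    ]
    # Gather and finalize: merge the partial results, re-apply ORDER BY ... LIMIT k.
    merged = [doc for partial in partials for doc in partial]
    top = heapq.nlargest(k, merged, key=lambda d: d["logins"])
    return top, len(partitions)  # result and number of nodes contacted


def global_index_top_k(sorted_index, k):
    """Global-index plan: one scan of one index on one node."""
    return sorted_index[:k], 1


random.seed(7)
logins = random.sample(range(100_000), 600)  # unique counts, so no ties
docs = [{"customer_name": f"c{i}", "logins": n} for i, n in enumerate(logins)]
partitions = [docs[i::6] for i in range(6)]  # hypothetical 6 data nodes
sorted_index = sorted(docs, key=lambda d: d["logins"], reverse=True)

sg_result, sg_nodes = scatter_gather_top_k(partitions, 10)
gsi_result, gsi_nodes = global_index_top_k(sorted_index, 10)
print(sg_result == gsi_result, sg_nodes, gsi_nodes)
```

Both plans return the same rows; what changes is how many nodes take part in answering the query.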
Introducing Global Secondary Indexes
What are Global Secondary Indexes?
High performance indexes for low latency queries with powerful caching, storage and independent placement.
Power of GSI
• Fully integrated into N1QL Query Optimization and Execution
• Independent Index Distribution for limiting scatter-gather
• Independent Scalability with Index Service (more on this later)
• Powerful caching and storage with ForestDB
Which to choose – GSI vs Views
Workloads             | New GSI in v4.0                                | Map/Reduce Views
Complex Reporting     | Just In Time Aggregation                       | Pre-aggregated
Workload Optimization | Optimized for Scan Latency & Throughput        | Optimized for Insertion
Flexible Index Logic  | N1QL Functions                                 | Javascript
Secondary Lookups     | Single Node Lookup                             | Scatter-Gather
Tunable Consistency   | Staleness false or ok or everything in between | Staleness false or ok
Which to choose – GSI vs Views
Capabilities         | New GSI in v4.0                          | Map/Reduce Views
Partitioning Model   | Independent – Indexing Service           | Aligned to Data – Data Service
Scale Model          | Independently Scale Index Service        | Scale with Data Service
Fetch with Index Key | Single Node                              | Scatter-Gather
Range Scan           | Single Node                              | Scatter-Gather
Grouping, Aggregates | With N1QL                                | Built-in with Views API
Caching              | Managed                                  | Not Managed
Storage              | ForestDB                                 | Couchstore
Availability         | Multiple Identical Indexes load balanced | Replica Based
Deep Dive
GSI Architecture
[Diagram: GSI architecture. The Data Service (Bucket#1, Bucket#2) runs a Projector & Router that feeds a DCP stream to the Index Service. In the Index Service, a Supervisor (index maintenance & scan coordinator) manages Index#1 through Index#4 on the ForestDB storage engine. The Query Service (query processor, cbq-engine) talks to the Index Service.]
Projector and Router
• 1 Projector and Router per node
• 1 stream of changes per bucket per supervisor
Supervisor
• 1 Supervisor per node
• Many indexes per Supervisor
Deeper Dive into Architecture
@3.45 - Architecture Track
Deep Dive into Global Secondary Indexing Architecture in Couchbase Server 4.0
John Liang, Architect, Couchbase
GSI Lifecycle
Indexing Lifecycle
Primary vs Secondary
Primary Index is a full list of document keys within a given bucket
CREATE PRIMARY INDEX index_name ON bucket_name
USING GSI|VIEW
WITH {"nodes": ["node_name"], "defer_build": true}; // GSI-ONLY
Secondary Index is an index on a field/expression on a subset of documents for lookups
CREATE INDEX index_name ON bucket_name (field/expression, …)
WHERE filter_expressions
USING GSI|VIEW
WITH {"nodes": ["node_name"], "defer_build": true}; // GSI-ONLY
Deferred Index Building
Index building can be deferred to build multiple indexes all at once with greater scan efficiency.
CREATE INDEX … WITH {… "defer_build": true};
CREATE INDEX … WITH {… "defer_build": true};
…
BUILD INDEX ON bucket_name(index_name1, …) USING GSI;
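One way to picture the "greater scan efficiency" of a deferred build is to count how many documents are read when indexes are built one by one versus together. The sketch below is an assumption about the general idea, not a description of the indexer's internals; the document shape, index names, and both functions are hypothetical.

```python
def build_one_at_a_time(docs, index_defs):
    """Build each index with its own full pass over the bucket."""
    indexes, docs_read = {}, 0
    for name, key_fn in index_defs.items():
        entries = []
        for doc_id, doc in docs.items():
            docs_read += 1
            entries.append((key_fn(doc), doc_id))
        indexes[name] = sorted(entries)
    return indexes, docs_read


def build_together(docs, index_defs):
    """Build all deferred indexes from one shared pass over the bucket."""
    entries = {name: [] for name in index_defs}
    docs_read = 0
    for doc_id, doc in docs.items():
        docs_read += 1
        for name, key_fn in index_defs.items():
            entries[name].append((key_fn(doc), doc_id))
    return {name: sorted(e) for name, e in entries.items()}, docs_read


# Hypothetical bucket contents and index definitions.
docs = {
    f"d{i}": {"name": f"n{i % 7}", "zip": str(94000 + i), "type": "customer_profile"}
    for i in range(100)
}
index_defs = {
    "by_name": lambda d: d["name"],
    "by_zip": lambda d: d["zip"],
    "by_type": lambda d: d["type"],
}

separate, separate_reads = build_one_at_a_time(docs, index_defs)
together, together_reads = build_together(docs, index_defs)
print(separate == together, separate_reads, together_reads)
```

The resulting indexes are identical; the batched build reads each document once instead of once per index.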
DEMO: Quick tour of GSI
GSI Partitioning and Placement
GSI Placement and Partitioning
Place GSI Indexes using NODES clause
• Each GSI resides on 1 node
• You can specify the node using the nodes clause
• You can scale out the index by creating identical indexes (load balanced)
CREATE INDEX i1 … WITH {"nodes": "node1"};
CREATE INDEX i1 … WITH {"nodes": "node1"};
CREATE INDEX i2 … WITH {"nodes": "node2"};
…
GSI Placement and Partitioning
Partition Indexes Manually with WHERE clause
You can partition with the WHERE clause and place on various nodes for scaling out
CREATE INDEX i1 … WHERE zipcode between "94000" and "94999" …
CREATE INDEX i2 … WHERE zipcode between "95000" and "96000" …
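As a rough illustration of manual partitioning, the sketch below routes a lookup to whichever index has a WHERE range covering the value. The `PARTITIONS` table, the `node1`/`node2` placement, and `index_for` are hypothetical; the comparison is a plain string comparison, mirroring the quoted zipcode bounds on the slide.

```python
# Hypothetical catalog: (index name, low bound, high bound, node).
PARTITIONS = [
    ("i1", "94000", "94999", "node1"),
    ("i2", "95000", "96000", "node2"),
]


def index_for(zipcode):
    """Return (index, node) for the partition whose range covers zipcode."""
    for name, low, high, node in PARTITIONS:
        if low <= zipcode <= high:  # inclusive on both ends, like BETWEEN
            return name, node
    return None  # no partial index covers this value


print(index_for("94107"))
print(index_for("95014"))
print(index_for("10001"))
```

A value outside every range, or one that falls between two ranges such as a hypothetical "94999-0001", is covered by neither index, so the ranges need to be chosen to cover every value the queries will ask for.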
GSI Availability and Rebalance
GSI Availability and Rebalance
Use multiple identical indexes for availability
GSI Availability
• Create multiple identical indexes on separate nodes for availability
• GSI will auto divert traffic if any copy goes down
GSI & Rebalance
• Removing/failing a node with the index service removes the GSI indexes on that node
• Adding a node with the index service won't automatically move any indexes to that node
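The availability behavior described above can be sketched as a picker that spreads scans across identical index copies and skips copies on nodes that are down. This is a toy model under stated assumptions: the `IndexCopies` class and the round-robin policy are hypothetical and are not taken from the Couchbase implementation.

```python
import itertools


class IndexCopies:
    """Identical indexes on separate nodes, picked round-robin."""

    def __init__(self, copies):
        self.copies = dict(copies)  # index name -> node it lives on
        self.down = set()           # nodes currently unavailable
        self._rr = itertools.cycle(sorted(self.copies))

    def mark_down(self, node):
        self.down.add(node)

    def pick(self):
        # Try each copy at most once, skipping copies on failed nodes.
        for _ in range(len(self.copies)):
            name = next(self._rr)
            if self.copies[name] not in self.down:
                return name
        raise RuntimeError("no copy of the index is available")


copies = IndexCopies({"i1": "node1", "i2": "node2"})
before = [copies.pick() for _ in range(4)]
copies.mark_down("node1")
after = [copies.pick() for _ in range(3)]
print(before, after)
```

With both nodes up the scans alternate between the copies; once node1 is marked down, every scan is diverted to the copy on node2.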
Monitoring GSI
Monitoring GSI Indexes
• Index Size and Maintenance Stats
• Index Scan Stats
Best Practices with GSI
New Consistency Settings!
View Stale-ness
• Ok: unbounded – query what's available in the index/view now
• False: query after all changes up to the request timestamp (and maybe more) have been indexed for a given index or view
New Indexes with Couchbase Server 4.0
• Improves granularity of the consistency logical timestamp
• New: Scan Consistency can be set to any logical timestamp
Indicate stale=false to stale=ok and everything in between
Flexible Consistency Settings
Time
t1: insert (k1, v1) …
t2: do other business logic computation …
t3: issue query/read on (k1, v1), with t3 vs t1
• Catch up all the indexes to t3 and then issue query: identical to "stale=false"
• Catch up all the indexes to t1 and then issue query: improved efficiency over "stale=false"
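The t1 versus t3 trade-off can be made concrete with a toy index that must catch up to a requested logical timestamp before it answers. Everything here is a hypothetical model (the `ToyIndex` class, the sequence numbers standing in for logical timestamps, the 50-entry mutation log); it only shows why waiting for t1 is cheaper than waiting for t3 while still seeing the write at t1.

```python
class ToyIndex:
    """Index that applies a mutation log up to a requested sequence number."""

    def __init__(self, log):
        self.log = log        # ordered (seqno, key, value) mutations
        self.applied = 0      # highest seqno already indexed
        self.entries = {}

    def catch_up(self, seqno):
        """Apply pending mutations up to seqno; return how many were applied."""
        work = 0
        for s, key, value in self.log:
            if self.applied < s <= seqno:
                self.entries[key] = value
                work += 1
        self.applied = max(self.applied, seqno)
        return work

    def scan(self, key, at_least):
        """Answer only after the index has caught up to at_least."""
        work = self.catch_up(at_least)
        return self.entries.get(key), work


# t1 = seqno 1 is our insert of (k1, v1); seqnos 2..50 are unrelated writes
# that arrive while the application does other work; t3 = seqno 50.
log = [(1, "k1", "v1")] + [(s, f"other{s}", "x") for s in range(2, 51)]
t1, t3 = 1, 50

value_t3, work_t3 = ToyIndex(log).scan("k1", at_least=t3)  # like stale=false
value_t1, work_t1 = ToyIndex(log).scan("k1", at_least=t1)  # only up to our write
print(value_t3, work_t3, value_t1, work_t1)
```

Both scans return the value written at t1, but the scan bounded at t1 waits for far fewer mutations to be indexed.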
Complex Types and GSI
Indexing Complex Types: Sub-document attributes
CREATE INDEX ifriend_id ON default(friends.id) USING GSI;
SELECT * FROM default WHERE friends.id = "002819";
{ "id": "00000000000001", "desc": "---", "type": "friends", "tags": [0,1,2,3,4,5,6,7,8,9], "friends": { "id": "002819", "class": "005" } }
Complex Types and GSI
Indexing Complex Types: Compound Keys
CREATE INDEX ifriends_id_class ON default(friends.id, friends.class) USING GSI;
SELECT * FROM default WHERE friends.id = "002819" AND friends.class = "005";
{ "id": "00000000000001", "desc": "---", "type": "friends", "tags": [0,1,2,3,4,5,6,7,8,9], "friends": { "id": "002819", "class": "005" } }
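A compound-key index can be pictured as a list of (friends.id, friends.class) pairs kept in sorted order, so that an exact match or a match on the leading key becomes a binary search followed by a short scan. The sketch below is a generic illustration, not ForestDB's layout; the second and third documents and both lookup helpers are hypothetical.

```python
import bisect

# First document mirrors the slide; the other two are made up for contrast.
docs = {
    "00000000000001": {"friends": {"id": "002819", "class": "005"}},
    "00000000000002": {"friends": {"id": "002819", "class": "007"}},
    "00000000000003": {"friends": {"id": "003000", "class": "005"}},
}

# Index entries: ((friends.id, friends.class), document key), kept sorted.
index = sorted(
    ((d["friends"]["id"], d["friends"]["class"]), doc_id)
    for doc_id, d in docs.items()
)


def lookup(friend_id, friend_class):
    """Exact match on both keys of the compound index."""
    key = (friend_id, friend_class)
    start = bisect.bisect_left(index, (key, ""))
    hits = []
    for entry_key, doc_id in index[start:]:
        if entry_key != key:
            break
        hits.append(doc_id)
    return hits


def lookup_by_id(friend_id):
    """Match on the leading key only, using the same sorted entries."""
    start = bisect.bisect_left(index, ((friend_id, ""), ""))
    hits = []
    for entry_key, doc_id in index[start:]:
        if entry_key[0] != friend_id:
            break
        hits.append(doc_id)
    return hits


print(lookup("002819", "005"))
print(lookup_by_id("002819"))
```

Because entries are ordered by the first key and then the second, the same structure serves lookups on both keys and lookups on the leading key alone.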
Complex Types and GSI
Indexing Complex Types: Sub-documents
CREATE INDEX ifriend ON default(friends) USING GSI;
SELECT * FROM default WHERE friends = {"class": "005", "id": "002819"};
{ "id": "00000000000001", "desc": "---", "type": "friends", "tags": [0,1,2,3,4,5,6,7,8,9], "friends": { "id": "002819", "class": "005" } }
Complex Types and GSI
Indexing Complex Types: Arrays
CREATE INDEX itags_sorted ON default(ARRAY_SORT(tags)) USING GSI;
SELECT * FROM default WHERE ARRAY_SORT(tags) = ARRAY_SORT([0,1,2,3,4,5,6,7,8,9]);
{ "id": "00000000000001", "desc": "---", "type": "friends", "tags": [0,1,2,3,4,5,6,7,8,9], "friends": { "id": "002819", "class": "005" } }
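Indexing a sorted copy of an array means two documents with the same tags in a different order end up under the same index key. The sketch below shows that idea with a plain dictionary; the documents `a`, `b`, `c` and the helper names are hypothetical, and `array_sort_key` merely stands in for what the slide's ARRAY_SORT expression computes.

```python
def array_sort_key(tags):
    """Normalize an array into an order-independent index key."""
    return tuple(sorted(tags))


# Hypothetical documents: "a" and "b" hold the same tags in different order.
docs = {
    "a": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "b": [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
    "c": [1, 2, 3],
}

# Index on the sorted array: key -> document keys.
index = {}
for doc_id, tags in docs.items():
    index.setdefault(array_sort_key(tags), []).append(doc_id)


def find(tags):
    """Equality lookup that ignores the order of the supplied tags."""
    return sorted(index.get(array_sort_key(tags), []))


print(find([3, 1, 2]))
print(find(list(range(10))))
```

The lookup sorts its argument the same way the index sorted the stored arrays, so the match does not depend on element order.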
Q&A
Cihan Biyikoglu
@cihangirb
Get Started Today Couchbase Server 4.0 & N1QL
Couchbase.com/beta
Thank you.