10 Reasons why Redis should be your Primary Database

YIFTACH SHOOLMAN, CO-FOUNDER AND CTO @ REDIS LABS

The Database Market

NoSQL: 35.0% CAGR, 2016-21

Relational: 7.5% CAGR, 2016-21

Source: 451 Research Total Data Market Monitor

DB-Engines Ranking

Docker Hub: The World’s Most Popular Database

# of containers launched as of Feb 2018

630M+ (1.87M/day, 78K/hr, 1.28K/min)

308M+

263M+

24M+


Redis Enterprise

DBaaS
• Available since mid 2013
• 8,100+ enterprise customers

Software
• Available since early 2015
• 300+ enterprise customers

550K+ databases managed worldwide

Customers
• 6 of top Fortune 10 companies
• 3 of top 5 communications companies
• 3 of top 4 credit card issuers
• 3 of top 5 healthcare companies

Reason 1: It’s Fast (Extremely Fast) and Scales Linearly

Redis Enterprise Cluster: Node 1, Node 2, ... Node N (odd number)

Odd number of symmetric nodes

Single master database: M

An HA database: M, S

A Clustered Database: M1, M2, M3

A Clustered Database: masters M1, M2, M3; slaves S3, S1, S2

Redis Enterprise Node

Cluster Manager

Enterprise Layer

Open Source Layer

REST API

Zero latency proxy

Redis Shards

Redis Enterprise: Shared Nothing Symmetric Architecture

Data-Path and Control/Management Path Separation

Node 1, Node 2, ... Node N (odd number)

Data Path: Redis Shards & Proxies

Cluster Management Path: Node Watchdog, Cluster Watchdog

Seamless Resharding

Before: Proxy in front of M and S. After: Proxy in front of M1, M2, S1, S2.

1. Bring trimmed slaves S1 and S2, each holding ½ of the dataset
2. Start draining, until no outstanding requests (~msec)
3. M → M1, S → M2 & stop draining
4. Trim M1, M2
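The four resharding steps can be sketched as a toy simulation. This is only an illustration under stated assumptions: shards are modelled as plain Python dicts, `half_of` is a hypothetical rule that decides which half a key belongs to, and draining is reduced to a comment because no requests are in flight. It is not the Redis Enterprise implementation.

```python
def half_of(key: str) -> int:
    """Hypothetical split rule: 0 for the first half, 1 for the second."""
    return sum(key.encode()) % 2


def reshard(master: dict, slave: dict, split=half_of):
    """Split one master/slave pair into two masters and two slaves."""
    # Step 1: bring up trimmed slaves, each with half of the dataset.
    s1 = {k: v for k, v in master.items() if split(k) == 0}
    s2 = {k: v for k, v in master.items() if split(k) == 1}
    # Step 2: draining (waiting for outstanding requests) has nothing to
    # wait for in this single-threaded toy model.
    # Step 3: M becomes M1 and S becomes M2; both still hold everything.
    m1, m2 = dict(master), dict(slave)
    # Step 4: trim M1 and M2 so each keeps only its own half.
    m1 = {k: v for k, v in m1.items() if split(k) == 0}
    m2 = {k: v for k, v in m2.items() if split(k) == 1}
    return m1, m2, s1, s2


data = {"a": 1, "b": 2, "c": 3, "d": 4}
m1, m2, s1, s2 = reshard(dict(data), dict(data))
```

After the call, `m1` and `m2` together hold the full dataset with no overlap, and each new master already has a matching slave.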

Scaling Out / In

Scaling out, Resharding & Rebalancing

Scale out Rebalancing Resharding

x2 faster

How Does the Proxy Work?

With Proxy: Multiplexing & pipelining

[Figure: application requests 1 to 7 passing through the proxy]

Scale-Out Proxy

Single-proxy – dense policy: Single Database Endpoint

Multi-proxy – sparse policy: Single Database Endpoint

Scaling Linearly with OSS Cluster API

M-node Redis Enterprise Cluster with an N-shard database (m nodes, n shards)

Node 1: Proxy 1; Shard 1, Shard 2, ... Shard n/m
HS ranges = {ranges #1, ranges #2, ... ranges #n/m}

Node 2: Proxy 2; Shard n/m+1, Shard n/m+2, ... Shard 2n/m
HS ranges = {ranges #n/m+1, ranges #n/m+2, ... ranges #2n/m}

Node M: Proxy M; Shard n-n/m+1, Shard n-n/m+2, ... Shard n
HS ranges = {ranges #n-n/m+1, ranges #n-n/m+2, ... ranges #n}

Clients: Client 1, Client 2, ... Client k
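To make the hash-slot (HS) layout concrete, here is a minimal sketch of how a cluster-aware client could route a key. The slot function follows the open-source Redis Cluster rule (CRC16 of the key, modulo 16384, with `{...}` hash tags). The functions `shard_for_slot` and `node_for_shard` are hypothetical and assume slots and shards are split into equal contiguous ranges; the actual placement in a given cluster may differ.

```python
import binascii

SLOTS = 16384  # number of hash slots in the OSS Redis Cluster key space


def key_slot(key: bytes) -> int:
    """Hash slot of a key: CRC16 (XMODEM variant) modulo 16384."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:  # only a non-empty {tag} is hashed on its own
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % SLOTS


def shard_for_slot(slot: int, n_shards: int) -> int:
    """0-based shard index, assuming equal contiguous slot ranges."""
    return slot * n_shards // SLOTS


def node_for_shard(shard: int, n_shards: int, m_nodes: int) -> int:
    """0-based node index, assuming n/m consecutive shards per node."""
    return shard // (n_shards // m_nodes)


slot = key_slot(b"foo")
shard = shard_for_slot(slot, n_shards=120)
node = node_for_shard(shard, n_shards=120, m_nodes=6)
```

Because every client can compute the slot locally, requests go straight to the proxy on the node that owns the shard, which is what lets throughput grow with the node count.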

True Linear Scalability

6 nodes, 120 shards cluster; 12 nodes, 240 shards cluster; 18 nodes, 360 shards cluster

Sub-millisecond latency is maintained across all the tests

Reason 2: It’s Highly Available

Lessons learned

• 5+ years in production

• 550K+ databases created

• 50+ data-centers/zones

• 2000+ node failure events

• 100+ complete data-center outages

HA Concept #1 – Quorum by Nodes, not by Shards

3 replicas Redis: Node 1 (M1), Node 2 (S1), Node 3 (S2), each 90GB on r4.4xlarge

Redis Enterprise: Node 1 (M1) and Node 2 (S1), each 90GB on r4.4xlarge; Node 3 is an m4.large quorum node with no data

• ~30% infrastructure cost savings
• Less network traffic
• Easy to manage
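The idea of counting quorum by nodes can be illustrated with a short sketch. It assumes a simple majority rule over cluster nodes, which is a common way to define quorum; the slides do not spell out the exact rule Redis Enterprise applies. The cluster layout below is hypothetical and mirrors the three-node example, where the third node holds no data.

```python
def has_quorum(nodes_up: int, total_nodes: int) -> bool:
    """True when a strict majority of cluster nodes is reachable."""
    return nodes_up > total_nodes // 2


# Node 3 carries no shard, yet it still counts toward the majority.
cluster = {"node1": "M1", "node2": "S1", "node3": None}

# Losing any single node (even a data node) keeps a 2-of-3 majority.
survives_one_failure = has_quorum(len(cluster) - 1, len(cluster))
```

Since the vote depends only on node membership, the third node can be a small instance rather than a full 90GB replica.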

HA Concept #2 – Pure In-Memory Replication

Disk-based Replication (OSS default): M to S in steps 1, 2, 3

OSS Diskless Replication: M to S in steps 1, 2

Pure In-Memory Replication: M to S in step 1

x2 faster

HA Concept #3 – Watchdogs are Part of the Cluster


HA Concept #4 – How to deploy a Multi-AZ/Rack Cluster

1. At least 3 AZs/Racks
2. Distance between AZs/Racks < 10 msec
3. Master and slave of the same shard must be deployed on different AZs/Racks
4. For every i, j, k: #_of_nodes(AZi + AZj) > #_of_nodes(AZk)
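The placement rules above lend themselves to a small validation sketch. The function and argument names are hypothetical; `nodes_per_az` maps an AZ/rack name to its node count and `shard_placement` maps a shard to the AZs of its master and slave. Rule 2 (inter-AZ latency) needs a network measurement and is left out.

```python
from itertools import permutations


def valid_multi_az(nodes_per_az: dict, shard_placement: dict) -> bool:
    """Check rules 1, 3 and 4 of the multi-AZ deployment list."""
    # Rule 1: at least 3 AZs/racks.
    if len(nodes_per_az) < 3:
        return False
    # Rule 3: master and slave of a shard live in different AZs/racks.
    for master_az, slave_az in shard_placement.values():
        if master_az == slave_az:
            return False
    # Rule 4: any two AZs together hold more nodes than any third AZ,
    # so losing one AZ never removes the node majority.
    for i, j, k in permutations(nodes_per_az, 3):
        if nodes_per_az[i] + nodes_per_az[j] <= nodes_per_az[k]:
            return False
    return True


balanced = valid_multi_az({"az1": 1, "az2": 1, "az3": 1},
                          {"shard1": ("az1", "az2")})
lopsided = valid_multi_az({"az1": 1, "az2": 1, "az3": 3},
                          {"shard1": ("az1", "az2")})
```

Here `balanced` passes, while `lopsided` fails rule 4 because `az3` alone holds more nodes than the other two AZs combined.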

Redis Failover Benchmark

Metrics: % of times data was lost; average time to recover in seconds

Compared: Redis Enterprise, AWS ElastiCache, Heroku Redis, Compose (IBM) Redis, Azure Redis Cache

<5sec

Reason 3: It’s Durable

Data-Persistence - The Wrong Way

Failed Instance (SSD - persistent and ephemeral data) → New Empty Instance (SSD - persistent and ephemeral data)

Data Loss

Data-Persistence - The Right Way

Failed Instance (SSD - ephemeral data) → New Populated Instance (SSD - ephemeral data)

Persistent Storage: AOF, Snapshot; data load into the new instance

No Data Loss

Uses Network Attached Persistent Storage, not Ephemeral

Discuss SQL Strategy

Tunable Data Persistence Configuration

Non-Replicated: M

Tuned for Speed: Data-Persistence at the slave (M, S)

Tuned for Reliability: Data-Persistence at the master & slave (M, S)

AOF-every-sec, AOF-every-write, Snapshot (RDB)
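The three persistence options correspond to settings that also exist in open-source Redis. As a rough sketch, the mapping below shows the `redis.conf` directives that would enable each one in OSS Redis. The option keys and the `save 900 1` snapshot interval are illustrative assumptions; Redis Enterprise configures persistence through its own management layer, which is not shown here.

```python
# Hypothetical option names mapped to real OSS redis.conf directives.
OSS_DIRECTIVES = {
    # Append-only file, fsync once per second.
    "aof-every-sec": ["appendonly yes", "appendfsync everysec"],
    # Append-only file, fsync on every write (safest, slowest).
    "aof-every-write": ["appendonly yes", "appendfsync always"],
    # RDB snapshot: here, after 900 seconds if at least 1 key changed.
    "snapshot": ["appendonly no", "save 900 1"],
}


def oss_directives(option: str) -> list:
    """Return the redis.conf lines for one persistence option."""
    return OSS_DIRECTIVES[option]


conf_fragment = "\n".join(oss_directives("aof-every-sec"))
```

The trade-off is the usual one: `appendfsync always` narrows the window of possible data loss at the cost of write latency, while snapshots are cheapest but can lose everything since the last dump.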

Two Main Challenges with Redis Data-Persistence

• Redis performance during AOF rewrite

• Data-persistence when multiple Redis instances reside on the same node

Redis Enterprise Enhanced Storage Engine

Redis Labs Proprietary and Confidential Information

Reason 4: It’s Modular

Polyglot Persistence

Categories: Key Value, Document, Graph, Wide Column

Products shown: Couchbase, Riak, Cassandra, DSE, OrientDB, Neo4j, InfiniteGraph, Titan, Redis, RethinkDB, DynamoDB, Oracle NoSQL, HBase, Hypertable, Cloud BigTable, MongoDB, CouchDB, DocumentDB, Cloudant, Aerospike

The Trend: from Polyglot Persistence to Multi-Model

Multi-Model – based on Open Core with Modules

Redis-ML, RediSearch, ReJSON, Redis-Timeseries, Redis-Graph, Rebloom, Rate Limiter, Custom (?)

And It’s still Fast (extremely fast) with Modules

[Chart axes: msec]

• RediSearch – x5
• ReBloom – x17
• Redis-ML – x2000
• Redis-Graph – Wait for RedisConf: Pier 27, San Francisco, April 26-88

It Uses a Different Approach for Active-Active

5

Active-Active: Existing Approaches are just Slow

Eventual Consistency → 100msec

Strong Consistency → 200msec

We Need Something Faster than the Speed of Light

Light > 20msec RTT

Network > 70msec RTT

Redis < 1msec RTT

Conflict Resolution is Hard

• Application level solution → too complex to write
• LWW (Last Write Wins) → doesn’t work for many of the Redis use cases, e.g.:
‒ Counters
‒ Sets
‒ Sorted Sets
‒ Lists
‒ Modules’ new datatypes

CRDT (Conflict-free Replicated Data Types)

• Years of academic research

• Based on a consensus-free protocol

• Strong eventual consistency

• Built to resolve conflicts with complex data types

The CRDT Approach

Strong Eventual Consistency → 1 msec

Solving Conflicts – Counters

Initial state on Replica A, Replica B and Replica C: c = 500

• Replica A: INCRBY 200
• Replica B: DECRBY 300
• Replica C: INCRBY 1000

Convergence Function (commutative):

500 + ∑c(i) = 500 + 200 - 300 + 1000 = 1400
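The counter example can be reproduced with a textbook PN-counter, sketched below. This is a generic CRDT construction chosen for illustration, not necessarily how Redis Enterprise implements counters; the class and method names are hypothetical. Each replica tracks its own increments and decrements, and a merge takes the per-replica maximum, so merges can arrive in any order.

```python
class PNCounter:
    """State-based counter CRDT with a shared starting value."""

    def __init__(self, replica_id: str, base: int = 0):
        self.id = replica_id
        self.base = base
        self.incs = {}  # replica id -> total amount incremented there
        self.decs = {}  # replica id -> total amount decremented there

    def incrby(self, amount: int) -> None:
        self.incs[self.id] = self.incs.get(self.id, 0) + amount

    def decrby(self, amount: int) -> None:
        self.decs[self.id] = self.decs.get(self.id, 0) + amount

    def merge(self, other: "PNCounter") -> None:
        # Per-replica max is commutative, associative and idempotent.
        for r, v in other.incs.items():
            self.incs[r] = max(self.incs.get(r, 0), v)
        for r, v in other.decs.items():
            self.decs[r] = max(self.decs.get(r, 0), v)

    def value(self) -> int:
        return self.base + sum(self.incs.values()) - sum(self.decs.values())


a, b, c = (PNCounter(r, base=500) for r in "ABC")
a.incrby(200)
b.decrby(300)
c.incrby(1000)
for x in (a, b, c):
    for y in (a, b, c):
        x.merge(y)
values = [r.value() for r in (a, b, c)]
```

After the merges all three replicas report the same value, 500 + 200 - 300 + 1000.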

Solving Conflicts – Sets

Initial state on Replica A, Replica B and Replica C: S = {A, B, C}

• Replica A: SADD D
• Replica B: SADD A
• Replica C: SREM A

Convergence Function (associative):

• S = S + D + A - A = {A, B, C, D}
• Observed Removed + Add Wins
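The "Observed Removed + Add Wins" rule can be illustrated with a generic observed-remove set (OR-Set). The sketch below is a textbook construction with hypothetical names, not the Redis Enterprise code. Every add gets a unique tag, and a remove only tombstones the tags that the removing replica has already seen, so a concurrent add of the same element survives.

```python
class ORSet:
    """State-based observed-remove set: concurrent adds win over removes."""

    def __init__(self, replica_id: str):
        self.id = replica_id
        self.counter = 0
        self.adds = set()     # (element, unique tag) pairs
        self.removes = set()  # tombstoned (element, tag) pairs

    def add(self, element) -> None:
        self.counter += 1
        self.adds.add((element, (self.id, self.counter)))

    def remove(self, element) -> None:
        # Tombstone only the tags this replica has observed so far.
        self.removes |= {(e, t) for (e, t) in self.adds if e == element}

    def merge(self, other: "ORSet") -> None:
        self.adds |= other.adds
        self.removes |= other.removes

    def value(self) -> set:
        return {e for (e, _tag) in self.adds - self.removes}


seed = ORSet("seed")
for member in ("A", "B", "C"):
    seed.add(member)
a, b, c = ORSet("A"), ORSet("B"), ORSet("C")
for replica in (a, b, c):
    replica.merge(seed)   # all replicas start from S = {A, B, C}

a.add("D")      # Replica A: SADD D
b.add("A")      # Replica B: SADD A (fresh tag that C has not observed)
c.remove("A")   # Replica C: SREM A (tombstones only the seed's tag)
for x in (a, b, c):
    for y in (a, b, c):
        x.merge(y)
result = a.value()
```

Element A stays in the set because Replica B's add carries a tag that Replica C never observed, so the remove cannot cancel it.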

Causal Consistency

This is NOT Causal Consistency

• Replica A: SADD A, S = {A}
• Replica B: receives SADD A, then SADD B, S = {A, B}
• Replica C: receives SADD B and SADD A, S = {B, A}

This is Causal Consistency

• Replica A: SADD A, S = {A}
• Replica B: receives SADD A, then SADD B, S = {A, B}
• Replica C: receives SADD A and SADD B, S = {A, B}

Active-Active: Comparison

• Eventual Consistency: 100msec
• Strong Consistency: 200msec
• Strong Eventual Consistency + Causal Consistency: <1msec

Reason 6: It Saves You $$

Multi-Tenant from Day One

• Single tenant, multiple shards/DBs
• OR
• Multi-tenant, multiple shards/DBs: Customer A, Customer B, ... Customer N

10TB Deployment on AWS with 2 Replicas

• #1 200GB, #2 200GB, ... #50 200GB: 50 x r3.8xlarge instances
• #51 200GB, #52 200GB, ... #100 200GB: 1st replica for HA
• #101 200GB, #102 200GB, ... #150 200GB: 2nd replica for quorum

Total cost (reserved instances) = $2,132,250/yr

10TB Deployment on AWS with 1 Replica + a Quorum Server

• #1 200GB, #2 200GB, ... #50 200GB: 50 x r3.8xlarge instances
• #51 200GB, #52 200GB, ... #100 200GB: 1 replica for HA
• #101 15GB: 1 quorum server

Total cost (reserved instances) = $1,421,500/yr

Savings = $710,750/yr
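The savings figure can be checked with a few lines of arithmetic. The two totals and the instance counts come from the slides; the per-instance price is derived here by division and is an inference, not a quoted price. The one-replica total turns out to equal exactly 100 instances at that derived price, which suggests (again as an inference) that the small quorum server is not counted in the quoted total.

```python
TWO_REPLICA_TOTAL = 2_132_250   # $/yr, 150 x r3.8xlarge (from the slide)
ONE_REPLICA_TOTAL = 1_421_500   # $/yr, 100 x r3.8xlarge + quorum server

# Derived, not quoted: implied yearly price of one reserved instance.
per_instance = TWO_REPLICA_TOTAL // 150

# 100 instances at the implied price reproduce the second total.
implied_one_replica = per_instance * 100

savings = TWO_REPLICA_TOTAL - ONE_REPLICA_TOTAL
savings_pct = round(100 * savings / TWO_REPLICA_TOTAL, 1)
```

The result is a saving of about one third of the yearly bill, in line with the "~30% infrastructure cost savings" claimed for the quorum-node design.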

Redis on Flash – Built for a Tiered Memory Architecture

Cluster Node
• DRAM: Keys & Hot Values
• SSD: Cold Values

Persistent Storage: Entire Dataset (AOF, Snapshot)
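The tiering idea (all keys and hot values in DRAM, cold values on SSD) can be sketched with a toy store. Everything here is an assumption made for illustration: the class name, the use of a plain dict to stand in for flash, and least-recently-used eviction as the hot/cold policy. The slides do not describe the actual Redis on Flash policy.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier store: keys always in RAM, values in RAM or 'flash'."""

    def __init__(self, ram_capacity: int):
        self.ram_capacity = ram_capacity
        self.keys = set()          # every key stays in RAM
        self.ram = OrderedDict()   # hot values, oldest first
        self.flash = {}            # cold values (stand-in for SSD)

    def _evict(self) -> None:
        # Demote least-recently-used values until RAM fits its budget.
        while len(self.ram) > self.ram_capacity:
            cold_key, cold_value = self.ram.popitem(last=False)
            self.flash[cold_key] = cold_value

    def set(self, key, value) -> None:
        self.keys.add(key)
        self.flash.pop(key, None)
        self.ram[key] = value
        self.ram.move_to_end(key)
        self._evict()

    def get(self, key):
        if key not in self.keys:   # answered from RAM, no flash lookup
            return None
        if key in self.flash:      # promote a cold value back to RAM
            self.ram[key] = self.flash.pop(key)
        self.ram.move_to_end(key)
        value = self.ram[key]
        self._evict()
        return value


store = TieredStore(ram_capacity=2)
for k in ("a", "b", "c"):
    store.set(k, k.upper())
fetched = store.get("a")
```

Keeping every key in RAM means a lookup for a missing key never touches flash; only reads of cold values pay the SSD latency.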

RoF - Designed for the New Persistent Memory Technology

NVMe vs. SATA

Optane (3DXP) vs. NVMe

DBaaS Price Comparison: 2TB Dataset with HA @ 100k ops/sec (on-demand pricing)

• Other Redis Provider (<1 msec): $2,200,162/yr
• RCP RAM (<1 msec): $1,772,606/yr
• Cloud-Based NoSQL (<10 msec): $766,096/yr
• RCP Flash (<1 msec): $232,875/yr

Up to 89% savings!

• Other Redis Provider: <1 msec
• Redis Enterprise VPC: <1 msec
• Cloud-Based NoSQL: <10 msec
• RoF: <1 msec

RoF by Numbers

• GA zero touch – 11/2017

• Quite a few customers

• All of them are using it as a primary data-store

• Database size 0.5TB → 10TB+

Reason 7: It’s Everywhere

Multi-Cloud / Hybrid

All Verticals

Financial Services, Advertising, Media, Technology, Communications, Education, Gaming, Banks, E-commerce, Business Services, Social, Travel

Everywhere

Devices
• Raspberry Pi support
• A single OSS Redis instance (3MB footprint)
• Persistent
• Redis Streams Client

Edge
• RPi/x86 nodes (4 cores / 2GB RAM / 50GB SSD)
• Redis Enterprise Cluster
• Redis Streams Server & Client
• Persistent & HA
• Redis on Flash
• Modules:
‒ Search, JSON, Graph, Time-Series, ReBloom, ML

Cloud
• Large Redis Enterprise Cluster(s)
• Multi-cloud/DBaaS/Self-managed
• Multi-master geo-replication
• Redis Streams Server
• Persistent & HA (multi-AZ)
• Redis on Flash
• Modules:
‒ Search, JSON, Graph, Time-Series, ReBloom, ML

Reason 8: It Simplifies Data Services

Cloud Data-Services Architecture: AWS Data-Services Flow

AWS Data-Services Flow = The WRONG Spaghetti Architecture

With the RIGHT Data-Services Architecture

Stream API or Streams Data-Structure

Multi-Function (1, 2, 3)

Reason 9: Because most of you have already been using it as a primary data-store

Redis Enterprise Survey Data

Use Cases

YES: 67%

NO: 33%

Primary Database Move to Redis Enterprise

Reason 10: You have some responsibility

One Way Ticket to the Cloud…


What can you do?


Make the Tech World Better

Thank you!
