30
Containerized DBs In a Machine Data Environment Docker NYC, 9th March 2017 @claus__m

Containerized DBs in a Machine Data environment with Crate.io

Embed Size (px)

Citation preview

Page 1: Containerized DBs in a Machine Data environment with Crate.io

Containerized DBsIn a Machine Data Environment

Docker NYC, 9th March 2017@claus__m

Page 2: Containerized DBs in a Machine Data environment with Crate.io

Claus (me)1.5yrs at Crate.ioDevRel/Field Engineering/Support/Integrations/…

Talk to me aboutRust, Raspberry Pis, Tech!

@claus__m

Page 3: Containerized DBs in a Machine Data environment with Crate.io

Crate.ioFounded in 2013

~25 people and growing

OfficesSan Francisco, Berlin, Dornbirn (AT)

Dockerized1M+ downloads on Docker hub

@claus__m

Page 4: Containerized DBs in a Machine Data environment with Crate.io

Machine Data

@claus__m

Page 5: Containerized DBs in a Machine Data environment with Crate.io

Source: HPE Jun 2016 http://www.slideshare.net/penumuru/harness-the-power-of-big-data-with-oracle-63438438/9

@claus__m

Page 6: Containerized DBs in a Machine Data environment with Crate.io

Machine Data CharacteristicsMillions of data points/second Streaming in from sensors, devices, logs, etc.

Data diversityStructured & unstructured JSON, Blobs

Real-time query performanceMonitoring & alerting

Complex queries of big data volumesWith Terabytes of historic data

GrowthAdding sources often means exponential growth @claus__m

Page 7: Containerized DBs in a Machine Data environment with Crate.io

Machine DataInternet of ThingsSensors, cameras, ...

Wearables, GadgetsLocation data, interaction data, ...

Logs & Monitoring dataComponent health monitoring, access logs, ...

Industry 4.0, DigitizationProduction line insights, automation, ...

VehiclesLocation data, health data, ...

@claus__m

Page 8: Containerized DBs in a Machine Data environment with Crate.io

Clickdrive.ioFleet management & vehicle trackingVehicle health and tracking data

2,000 data points per car, per second

In-depth & real-time analysis

● Predictive maintenance ● Fix cars before catastrophe strikes● Route & driver efficiency● Accident reconstruction

@claus__m

Page 9: Containerized DBs in a Machine Data environment with Crate.io

RoomonitorSmart apartments for property managers

● Climate monitoring & control● Occupancy● Noise level● Access monitoring & control

Alerts

● AC is on with window open!● Make rental properties more efficient,

safe, profitable

@claus__m

Page 10: Containerized DBs in a Machine Data environment with Crate.io

Skyhigh NetworksCloud access security broker (CASB)

● Logging access data from cloud services● Billions of accesses per day from 600+

customers, 10s of thousands of concurrent TCP connections

Machine data is the fingerprint of fraud

● Run AI algorithms to define “normal”● Flag deviations/anomalies in real-time

@claus__m

Page 11: Containerized DBs in a Machine Data environment with Crate.io

ArchitectureFor Machine Data

@claus__m

Page 12: Containerized DBs in a Machine Data environment with Crate.io

MicroservicesContainersIsolation by default

FlexibilityBuilding blocks

Horizontally scalableMostly

Stateful containersDatabases?

@claus__m

Page 13: Containerized DBs in a Machine Data environment with Crate.io

An Open StackExample

@claus__m

Page 14: Containerized DBs in a Machine Data environment with Crate.io

Consumer

Visualizer

An Example

Sensor

SensorProduces data

ConsumerReceives and enriches data

VisualizerDraws stuff

@claus__m

Page 15: Containerized DBs in a Machine Data environment with Crate.io

LOAD BALANCER

V

C

Deploy!S

U

SS

C

V

C

V

Load balancerFor TLS, reverse proxying, load balancing

High availability3 instances

A few sensorsOne user to actually use it

@claus__m

Page 16: Containerized DBs in a Machine Data environment with Crate.io

Go Live

More users!More sensors and users

Data storageSlow and fast

Monitoring & AnalyticsTwo different subsystems

LOAD BALANCER

V

C

S

S

U

S SU

NoSQL DBMessage Queue

SQL DB

US

S

C

V

C

MO

NITO

RING

V

S

AN

ALY

TICS

@claus__m

Page 17: Containerized DBs in a Machine Data environment with Crate.io

But ...

Even more users?Horizontal scaling?

Maintenance & bug hunting?Mostly via scheduled downtimes

Reporting?Kafka? Elasticsearch?

Security?Access control?

Expertise?Knowledge transfer?

LOAD BALANCER

V

C

S

U

S SU

NoSQL DBMessage Queue

SQL DB

US

S

C

V

C

MO

NITO

RING

V

S

AN

ALY

TICS

S

@claus__m

Page 18: Containerized DBs in a Machine Data environment with Crate.io

A Database for MicroservicesShared nothing

Replication

Sane defaults

Resilient

Cross-functional

@claus__m

Page 19: Containerized DBs in a Machine Data environment with Crate.io

Another DB?Yay!

@claus__m

Page 20: Containerized DBs in a Machine Data environment with Crate.io

Yay!

…which one though?

@claus__m

Page 21: Containerized DBs in a Machine Data environment with Crate.io

CrateDBgithub.com/crate/crate

hub.docker.com/r/_/crate

@claus__m

Page 22: Containerized DBs in a Machine Data environment with Crate.io

Yay!

@claus__m

Page 23: Containerized DBs in a Machine Data environment with Crate.io

CrateDBShared nothing

Partitioning & auto-sharding

Replication

(Almost) Zero config

Multi model: Structured & unstructured

SQL

@claus__m

Page 24: Containerized DBs in a Machine Data environment with Crate.io

CrateDB FundamentalsDisk-based index with in-memory cachingFast and efficient OS caching

Shards: Units of dataConcurrency by distributing shards

Distributed query execution engine“Push down” queries

@claus__m

Page 25: Containerized DBs in a Machine Data environment with Crate.io

Postgres Wire Protocol

Presto Parser

Distributed Query Planner

Query Execution Engine

Elasticsearch

Lucene

CLIENT

@claus__m

Page 26: Containerized DBs in a Machine Data environment with Crate.io

A better setup!Horizontal scalabilityScale out everything

Reduced tech stackGet to know it quicker

Live reportingUse ad-hoc queries on production data

FlexibilitySchema Evolution not required @claus__m

LOAD BALANCER

V

C

SS

U

S SU

US

S

C

V

C

MO

NITO

RING

V

S

AN

ALY

TICS

Page 27: Containerized DBs in a Machine Data environment with Crate.io

A better setup!No single point of failureAs highly available as your service

Reduced network trafficBetter reliability

No queueWork with real data

DB isolationAccessible onlyfrom the host

@claus__m

LOAD BALANCER

V

C

SS

U

S SU

US

S

C

V

C

MO

NITO

RING

V

S

AN

ALY

TICS

Page 28: Containerized DBs in a Machine Data environment with Crate.io

Demo Time!

@claus__m

Page 29: Containerized DBs in a Machine Data environment with Crate.io

An Open Stack for Machine DataAd-hoc analysis with SQLInstant reporting on production data

Integrates wellLegacy SQL applications included

Horizontally scalableContainer native, highly availability

@claus__m

Page 30: Containerized DBs in a Machine Data environment with Crate.io

Thanks!

Links

https://github.com/celaus

https://github.com/crate

https://hub.docker.com/r/_/crate

https://crate.io

@crateio

@claus__m