Big data architectures


DESCRIPTION

Presentation given at @DamnData discussing different architecture types for BigData environments


BigData Architectures

Daan Gerits, Dasos

Volume

We already have that:

- NAS/SAN
- High Performance Computing

Variety

We already have that:

- Meta-modeling
- NAS/SAN

Velocity

We already have that:

- Complex Event Processing

But do you have all of that in 1 platform?

But How??

Architectures

(Thx Nathan Marz!)

Analytical Big Data

Analysis-oriented

Optimize

Non-intrusive

Delta

Diagram components: Data Sources, Ingestion Engine, Distributed Database, Enrich, Data Systems, Apps, Dashboards

Delta

Diagram components: Data Sources, Distributed Database, Data Systems, Apps, Dashboards

Technologies shown:

- Flume, Sqoop, Scribe, ...
- MR, Pig, Crunch, Mahout, ...
- MR, Pig, Crunch, ...
- Impala, Hive, ...

Delta

Analytical Big Data architecture for enriching mostly structured data with the goal to optimize business processes.
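The ingest, enrich, and query stages named on the Delta slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory list stands in for the distributed database, and the function names (`ingest`, `enrich`, `total_by_segment`), the order records, and the segment lookup are all hypothetical. In a real stack, tools such as those listed on the slides (Flume or Sqoop, MR or Pig, Impala or Hive) would take these roles.

```python
# Hypothetical sketch of an analytical pipeline: ingest -> store -> enrich -> query.
# A plain list stands in for the distributed database.

def ingest(source_rows, store):
    """Copy rows into the store; the source itself is never modified (non-intrusive)."""
    store.extend(dict(row) for row in source_rows)


def enrich(store, segments):
    """Return enriched copies of the stored rows with a derived 'segment' field."""
    return [
        {**row, "segment": segments.get(row["customer"], "unknown")}
        for row in store
    ]


def total_by_segment(enriched_rows):
    """Analytical query: sum order amounts per customer segment."""
    totals = {}
    for row in enriched_rows:
        totals[row["segment"]] = totals.get(row["segment"], 0) + row["amount"]
    return totals


# Made-up example data.
source_rows = [
    {"customer": "acme", "amount": 100},
    {"customer": "globex", "amount": 50},
    {"customer": "acme", "amount": 25},
]
segments = {"acme": "enterprise"}

store = []
ingest(source_rows, store)
totals = total_by_segment(enrich(store, segments))
```

The source rows are left untouched, because enrichment only ever works on the ingested copies.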

Delta

Diagram components: Data Sources, Ingestion Engine, Distributed Database, Enrich, Data Systems, Apps, Dashboards

Overload!

Delta

Be write-heavy or read-heavy, NOT both!

Operational Big Data

(Thx Nathan Marz!)

Focussed on Day-to-day business

Innovate

(Non-)intrusive

Lambda

Diagram components: Data Sources, Realtime Processing, Fact Store, Batch Views A/B/C, Realtime Views A/B/C, Just-In-Time Combiner, Apps, Dashboard, Reports

Lambda

Diagram components: Data Sources, Storm, HDFS, ElephantDB (x3), Cassandra* (x3), Custom Code*, Apps, Dashboard, Reports

Lambda

Operational Big Data architecture for storing and processing multi-structured and immutable data with the goal to innovate business.
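The interplay of the fact store, batch views, realtime views, and just-in-time combiner on the Lambda slides can be sketched as below. This is a minimal in-memory illustration, not the presenter's implementation: the page-view counting use case and all names (`Fact`, `FactStore`, `build_batch_view`, `RealtimeView`, `combine`) are assumptions made for the example. In practice, the technologies listed on the second Lambda slide (Storm, HDFS, ElephantDB, Cassandra) would take these roles.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """An immutable event; facts are only appended, never updated."""
    timestamp: int
    page: str


class FactStore:
    """Append-only store of all facts (hypothetical in-memory stand-in)."""

    def __init__(self):
        self._facts = []

    def append(self, fact):
        self._facts.append(fact)

    def up_to(self, timestamp):
        return [f for f in self._facts if f.timestamp <= timestamp]


def build_batch_view(store, batch_time):
    """Recompute page-view counts from scratch over all facts up to batch_time."""
    return Counter(f.page for f in store.up_to(batch_time))


class RealtimeView:
    """Incrementally counts only the facts newer than the last batch run."""

    def __init__(self, batch_time):
        self.batch_time = batch_time
        self.counts = Counter()

    def on_fact(self, fact):
        if fact.timestamp > self.batch_time:
            self.counts[fact.page] += 1


def combine(batch_view, realtime_view, page):
    """Just-in-time combiner: merge both views at query time."""
    return batch_view[page] + realtime_view.counts[page]


# Made-up events: the batch layer has processed everything up to timestamp 2.
store = FactStore()
realtime = RealtimeView(batch_time=2)
for ts, page in [(1, "/home"), (2, "/home"), (3, "/home"), (4, "/about")]:
    fact = Fact(ts, page)
    store.append(fact)    # every fact lands in the fact store
    realtime.on_fact(fact)  # and is also seen by the realtime layer

batch = build_batch_view(store, batch_time=2)
home_views = combine(batch, realtime, "/home")
about_views = combine(batch, realtime, "/about")
```

Because facts are immutable, the batch view can always be rebuilt from the fact store, and the realtime view only has to cover the gap since the last batch run.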

Technologies to use

Pick your stack!

Advice

- Pilots, PoC, PoT, … do them!
- Be pragmatic, start skinny
- In Belgium: Variety > Volume
- Be prepared to pivot on technologies

Questions? Thoughts? Ideas? Disagreements? ...

daan.gerits@dasos.be
www.dasos.be
@daangerits

All images are used for illustrative purposes only. It was in no way my intention to violate any rights by using them.

BigData Architectures

Backup Slides

Volume

Variety

Velocity

Lambda

Multi-structured

Un-structured

Re-structured

Recommended