Evolution of Data Architectures: From Hadoop to Data Lake in becoming Data Driven
Alexandre Vasseur, Pivotal @PivotalFrance
© Copyright 2015 Pivotal. All rights reserved.
If you have one thing to do
Store Massive Data Sets
Achieve Continuous Innovation at Scale
Becoming Data Driven with Apps
Data Driven Apps
AGILE DEV & DATA SCIENCE: MODERN, COLLABORATIVE
APP & DEV PLATFORM: MODERN, CLOUD-ORIENTED & OPEN
DATA FABRIC: MODERN, CLOUD-ORIENTED & OPEN
The Big Data Problem
Fragmentation, Constraints, Complexity
Pivotal + Hortonworks Alliance
• Started July 2014 around Ambari collaboration
• Announcing Pivotal Big Data Suite on Hortonworks Data Platform
• Advanced support from world’s leading Hortonworks support services
• Joint engineering efforts and enhanced Pivotal HD
ODP - Standardize Hadoop Ecosystem
• Deliver ODP Core to build a versioned, packaged, tested set of Hadoop components.
• Focus on developing a platform, rather than projects
• Initial scope on Apache Hadoop HDFS / MR / YARN / Ambari
Remove vendor lock-in
Ecosystem Effect
Shorter Innovation Cycles
http://opendataplatform.org
…
Open Sourced but not just Hadoop
• Open sourcing all Pivotal Big Data Suite components
– Pivotal GemFire - premium in-memory NoSQL database
– Pivotal HAWQ - world’s leading SQL compliant enterprise SQL on Hadoop
– Pivotal Greenplum Database - advanced enterprise MPP analytic database with Hadoop interconnect
– SpringXD - Unified, distributed, and extensible system for data driven application development
HAWQ SQL on Hadoop
PROVEN AT SCALE PRODUCTIVE NATIVE on HADOOP / ODP OPEN & EXTENSIBLE
HAWQ SQL on Hadoop
• 10+ years R&D in Massively Parallel SQL
• SQL engine at peta scale analytics in world’s largest industries
• Mature cost based query optimizer
• Full SQL semantics
• Rich ecosystem of ELT/dataviz/BI & partners
• PL/*, built-in analytics, R native framing
• All Hadoop formats (gz, Parquet, HAWQ etc)
• Data node short circuit reads (colocated, not M/R based)
• Predicate pushdown to Hive, HBase
• HAWQ PXF: Query federation to NoSQL, DB, etc
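To make the "predicate pushdown" point concrete, here is a minimal Python sketch of the general idea. It is not HAWQ or PXF code: `ToyStore`, `rows_shipped` and both query functions are hypothetical names invented for illustration. It only shows that applying the filter at the data source returns the same answer while shipping far fewer rows to the query engine.

```python
from typing import Callable, Iterator, Optional

Row = dict
Predicate = Callable[[Row], bool]


class ToyStore:
    """Hypothetical external store that counts the rows it ships out."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.rows_shipped = 0

    def scan(self, predicate: Optional[Predicate] = None) -> Iterator[Row]:
        # When a predicate is supplied, filtering happens inside the store,
        # so only matching rows ever leave it.
        for row in self.rows:
            if predicate is None or predicate(row):
                self.rows_shipped += 1
                yield row


def query_without_pushdown(store: ToyStore, predicate: Predicate) -> list:
    # The engine pulls every row, then filters on its own side.
    return [row for row in store.scan() if predicate(row)]


def query_with_pushdown(store: ToyStore, predicate: Predicate) -> list:
    # The engine hands the filter to the store.
    return list(store.scan(predicate))


rows = [{"id": i, "country": "FR" if i % 10 == 0 else "US"} for i in range(1000)]


def is_fr(row: Row) -> bool:
    return row["country"] == "FR"


naive, pushed = ToyStore(rows), ToyStore(rows)
result_naive = query_without_pushdown(naive, is_fr)
result_pushed = query_with_pushdown(pushed, is_fr)

print(naive.rows_shipped, pushed.rows_shipped)
```

Both queries return the same rows; the difference is only in how much data crosses the boundary between store and engine.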
SpringXD
Data from anywhere, to anywhere
Real time & batch
Ingest + analytics + jobs orchestration
Developer friendly
Built in connectors
With / without Spark
DSL
Your choice of Hadoop
Your choice of messaging
Standalone, YARN & outside Hadoop
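Spring XD stream definitions chain modules with a pipe symbol, in the spirit of a Unix pipeline. The Python sketch below imitates that idea only; it is not the Spring XD DSL or runtime. The module names (`numbers`, `double`, `big-only`, `collect`) and the `run_stream` helper are hypothetical and exist purely for illustration.

```python
from functools import reduce

# Hypothetical module registry: each module takes the upstream data
# and returns the data for the next stage.
MODULES = {
    "numbers": lambda _: iter(range(1, 6)),                  # source, ignores its input
    "double": lambda items: (x * 2 for x in items),          # processor
    "big-only": lambda items: (x for x in items if x > 4),   # processor (filter)
    "collect": lambda items: list(items),                    # sink
}


def run_stream(definition: str, modules=MODULES):
    """Split a 'source | processor | sink' string and run the stages in order."""
    names = [name.strip() for name in definition.split("|")]
    return reduce(lambda data, name: modules[name](data), names, None)


result = run_stream("numbers | double | big-only | collect")
print(result)  # [6, 8, 10]
```

The design point is that the pipeline is declared as data (a string), so stages can be swapped or reordered without touching the code that runs them.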
Simplify Data Driven Applications
• PaaS with NoSQL & Big Data choices built-in
• Emergence of vertical services: Mobile, IoT, …
Data centric runtimes built in: Java/PHP/Node.js/Ruby, Python, R/Shiny, Scala, SpringXD
Large choice of data services:
DB, clustered MySQL etc
Memcache, Redis etc
GemFire, Cassandra etc
Hadoop, Greenplum etc
Can run virtualized inside PaaS
Can run multi-tenant-ified alongside PaaS
DEMO
PHD (or any ODP Core-based Hadoop Distribution)
HDFS
HAWQ (SQL on Hadoop)
GreenplumDB (Analytics DW)
GemFire (JSON/Object in memory data grid)
Redis (Key Value Store)
RabbitMQ
SpringXD (Stream Processing/scoring)
SpringXD
Cloud Foundry Data Services
HBase Hive
PXF (Filtered Pushdown)
Direct Store Federated
GPHDFS
Write behind Persistence
Analytic Apps Online Apps
Pivotal Big Data Suite
Spark
The New Data Imperatives
Converged Data & Cloud
Open Data-Driven Apps
A NEW PLATFORM FOR A NEW ERA
Meet us at the booth! Come do a “HAWQ in 2 min” lab
Win Beats Solo2 headphones!