DESCRIPTION
Pivotal, EMC's Big Data platform, ships technologies for running high-performance in-memory SQL queries, and more. Presentation by Alexandre Vasseur and Jérôme Campo of Pivotal.
A NEW PLATFORM FOR A NEW ERA
© Copyright 2013 Pivotal. All rights reserved.
SQL and in-memory on Hadoop with Pivotal and HAWQ
Alexandre Vasseur, Jérôme Campo
Field Engineering, Pivotal
Pivotal
Spin-off of EMC and VMware
Software vendor
More than 1,250 employees
Data Science Team Pivotal HD
A 1,000-node Hadoop cluster for the community
• 1,000 nodes, 24,000 cores
• 48 TB RAM
• 24 PB (12,000 disks)
• Improve Hadoop
• Validate the Hadoop ecosystem at scale
http://www.analyticsworkbench.com
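The cluster totals above translate into per-node averages with simple arithmetic. The sketch below assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and an even spread of resources across nodes; neither assumption is stated on the slide.

```python
# Per-node averages derived from the cluster totals on the slide.
# Assumption: decimal units and resources spread evenly across nodes.
nodes = 1000
cores = 24_000
ram_tb = 48
storage_pb = 24
disks = 12_000

cores_per_node = cores // nodes            # cores available on each node
ram_gb_per_node = ram_tb * 1000 // nodes   # RAM per node, in GB
disks_per_node = disks // nodes            # disks attached to each node
tb_per_disk = storage_pb * 1000 / disks    # raw capacity of one disk, in TB
tb_per_node = storage_pb * 1000 / nodes    # raw capacity of one node, in TB

print(cores_per_node, ram_gb_per_node, disks_per_node, tb_per_disk, tb_per_node)
```

Under these assumptions each node averages 24 cores, 48 GB of RAM and 12 disks of 2 TB each.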
Pivotal Hadoop
HDFS
HBase
Pig, Hive, Mahout
Map Reduce
Sqoop Flume
Resource Management & Workflow
Yarn
Zookeeper
Apache Pivotal HD Added Value
Configure, Deploy, Monitor, Manage
Command Center
Hadoop Virtualization (HVE)
Data Loader
Pivotal HD Enterprise
Xtension Framework
Catalog Services
Query Optimizer
Dynamic Pipelining
ANSI SQL + Analytics
HAWQ: Advanced Database Services
10 years of R&D on the massively parallel database
• High-performance SQL engine
  – Multi-petabyte
  – Full ANSI SQL
  – Standard drivers and ecosystem
• Direct access to Hadoop formats
  – Text, Avro, Hive, HBase, other formats via API
• Massively parallel database on Hadoop
  – Columnar, compressed, partitioned, polymorphic storage
  – Priority and access management
• In-database analytics
  – Parallelized statistics and machine learning libraries
  – Accessible via R or SQL
MADlib
How HAWQ works
Clients: JDBC/ODBC, SQL Console
SELECT beer, price FROM Bars b, Sells s WHERE b.name = s.bar AND b.city = 'San Francisco'
HDFS Namenode / HAWQ Master Host: Query Parser, Query Optimizer
HDFS Datanode / HAWQ Segment Host (three shown, . . .): Query Executor
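To see what the example query returns, here is a minimal sketch that runs it against an in-memory SQLite database rather than HAWQ. The column lists of `Bars` and `Sells` and all sample rows are assumptions made up for illustration; the slide only shows the query text. An `ORDER BY` is added so the output order is deterministic.

```python
import sqlite3

# In-memory SQLite stands in for HAWQ here; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Bars (name TEXT PRIMARY KEY, city TEXT);
CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL);
INSERT INTO Bars VALUES
    ('Bar A', 'San Francisco'),
    ('Bar B', 'San Francisco'),
    ('Bar C', 'Paris');
INSERT INTO Sells VALUES
    ('Bar A', 'Lager', 5.0),
    ('Bar B', 'Stout', 6.0),
    ('Bar C', 'Pils', 4.0);
""")

# Same query as on the slide, plus ORDER BY for a stable result order.
query = """
SELECT beer, price
FROM Bars b, Sells s
WHERE b.name = s.bar
  AND b.city = 'San Francisco'
ORDER BY beer
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Only beers sold in bars located in San Francisco come back; the row for the Paris bar is dropped by the filter on `b.city`.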
How HAWQ works: Execution Plan
Clients: JDBC/ODBC, SQL Console
HDFS Namenode / HAWQ Master Host: Query Parser, Query Optimizer
HDFS Datanode / HAWQ Segment Host (three shown, . . .): Query Executor
Plan operators:
• Scan Bars b
• Filter b.city = 'San Francisco'
• Motion Redist(b.name)
• Scan Sells s
• HashJoin b.name = s.bar
• Project s.beer, s.price
• Motion Gather
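The plan operators can be imitated in a few lines of plain Python to show how the data moves. This is a toy simulation, not HAWQ's implementation: the sample rows, the hash function, the round-robin placement of `Bars`, and the assumption that `Sells` is already distributed by `bar` are all hypothetical.

```python
NUM_SEGMENTS = 3  # the slide shows three segment hosts

def segment_for(key):
    # Deterministic toy hash; a real engine uses its own hash function.
    return sum(key.encode()) % NUM_SEGMENTS

def redistribute(rows_by_segment, key_index):
    # "Motion Redist": send each row to the segment chosen by hashing its key.
    out = [[] for _ in range(NUM_SEGMENTS)]
    for rows in rows_by_segment:
        for row in rows:
            out[segment_for(row[key_index])].append(row)
    return out

# Hypothetical data: (name, city) and (bar, beer, price).
bars = [("Bar A", "San Francisco"), ("Bar B", "San Francisco"), ("Bar C", "Paris")]
sells = [("Bar A", "Lager", 5.0), ("Bar B", "Stout", 6.0), ("Bar C", "Pils", 4.0)]

# Assumption: Sells is stored distributed by bar, Bars is stored round-robin.
sells_by_segment = redistribute([sells], key_index=0)
bars_by_segment = [bars[i::NUM_SEGMENTS] for i in range(NUM_SEGMENTS)]

# Scan Bars b + Filter b.city = 'San Francisco', run locally on each segment.
filtered = [[b for b in seg if b[1] == "San Francisco"] for seg in bars_by_segment]

# Motion Redist(b.name): co-locate matching Bars rows with the Sells rows.
bars_redist = redistribute(filtered, key_index=0)

def hash_join_project(bars_seg, sells_seg):
    # HashJoin b.name = s.bar, then Project s.beer, s.price.
    build = {name for name, _city in bars_seg}
    return [(beer, price) for bar, beer, price in sells_seg if bar in build]

per_segment = [hash_join_project(b, s) for b, s in zip(bars_redist, sells_by_segment)]

# Motion Gather: collect the per-segment results in one place (sorted for display).
result = sorted(row for seg in per_segment for row in seg)
print(result)
```

Filtering before the redistribution keeps the amount of data sent between segments small, since only the San Francisco bars have to move.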
How HAWQ works
Clients: JDBC/ODBC, SQL Console
HDFS Namenode / HAWQ Master Host: Query Parser, Query Optimizer
HDFS Datanode / HAWQ Segment Host (three shown, . . .): Query Executor
The same plan operators appear next to each of the three Query Executors:
• Scan Bars b
• Filter b.city = 'San Francisco'
• Motion Redist(b.name)
• Scan Sells s
• HashJoin b.name = s.bar
• Project s.beer, s.price
• Motion Gather
HDFS Shared Data - HFiles
ICM
10 years of R&D on NoSQL/NewSQL in-memory data grids
Native Persistence
Map-Reduce
I/P & O/P Formatter
Re-evaluate Model
Model Refresh
Online Apps Sensor Data / Feeds
Re-evaluate Model
Model Refresh
HAWQ
GPXF DW
External Tables
Analytic Apps
In-memory No/NewSQL on Hadoop
• Benefits of an in-memory data grid
  – Data in memory when it is needed
  – Very high availability, massive concurrency, in-memory response times
• Native Hadoop integration
  – Eviction / storage on native HDFS
  – Access to in-memory or global data via SQL/NoSQL and HAWQ
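The idea of keeping hot data in memory and evicting the rest to HDFS can be illustrated with a small sketch. This is not the product's API: the class name `OverflowingCache` is hypothetical, a plain dict stands in for HDFS, and least-recently-used eviction is an assumed policy that the slide does not specify.

```python
from collections import OrderedDict

class OverflowingCache:
    """Toy in-memory store that evicts least-recently-used entries to a backing store."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.memory = OrderedDict()   # hot data, ordered from oldest to newest use
        self.backing = backing_store  # stands in for HDFS; here a plain dict

    def put(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)
        # Evict the oldest entries to the backing store once over capacity.
        while len(self.memory) > self.capacity:
            old_key, old_value = self.memory.popitem(last=False)
            self.backing[old_key] = old_value

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key]
        # Not in memory: fall back to the backing store (KeyError if absent there too).
        return self.backing[key]

hdfs = {}
cache = OverflowingCache(capacity=2, backing_store=hdfs)
for key, value in [("a", 1), ("b", 2), ("c", 3)]:
    cache.put(key, value)

print(sorted(cache.memory), hdfs, cache.get("a"))
```

With a capacity of two, the third write pushes the oldest entry out to the backing store, yet a read of that key still succeeds, which mirrors the "in-memory or global" access described on the slide.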
Try Pivotal HD
Pivotal HD Single Node VM
• Hadoop stack components: Pig, Hive, HBase, HDFS, Mahout, YARN, MRv2
• HAWQ / PXF
• Command Center
• DataLoader
• Eclipse, Maven, Ant
• Retail Data Set
Pivotal HD with Vagrant
• Multi-VM installation with VirtualBox or VMware Workstation/Fusion
http://gopivotal.com/pivotal-products/data/pivotal-hd#4
http://blog.gopivotal.com/products/in-45-min-set-up-hadoop-pivotal-hd-on-a-multi-vm-cluster-run-test-data
Big/Fast Demo: Big Data Workflow
Diagram labels: HTTP, Pipe, Filter, HDFS Sink, Transform
Tap: Logistic Regression, Analytic Counter, JSON Field Extract
Tap: Analytic Counter, JSON Field Extract
MADlib
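The workflow labels suggest a stream that flows from an HTTP source through a filter and a transform into an HDFS sink, with taps that observe the stream and feed counters. The sketch below is one possible reading of the diagram, not the demo's actual code: the function names, the `product` and `amount` fields, and the sample events are hypothetical, a Python list stands in for the HDFS sink, and the logistic regression step is left out.

```python
import json
from collections import Counter

def run_pipeline(raw_events, keep, transform, sink, taps=()):
    # Main stream: source -> filter -> transform -> sink.
    # Each tap sees every raw event without altering the main stream.
    for raw in raw_events:
        for tap in taps:
            tap(raw)
        event = json.loads(raw)
        if keep(event):
            sink.append(transform(event))

field_counts = Counter()

def count_field(raw, field="product"):
    # Tap: extract one JSON field and increment its counter.
    value = json.loads(raw).get(field)
    if value is not None:
        field_counts[value] += 1

# Hypothetical events, as they might arrive over HTTP.
incoming = [
    '{"product": "tv", "amount": 300}',
    '{"product": "phone", "amount": 0}',
    '{"product": "tv", "amount": 450}',
]

hdfs_sink = []  # stands in for files written to HDFS
run_pipeline(
    incoming,
    keep=lambda e: e["amount"] > 0,
    transform=lambda e: f'{e["product"]},{e["amount"]}',
    sink=hdfs_sink,
    taps=[count_field],
)
print(hdfs_sink, dict(field_counts))
```

The tap counts all incoming events, including the one the filter drops, because it reads the stream before the filter step.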
We're hiring! avasseur@gopivotal.com jcampo@gopivotal.com
Thank you