Upload
inmobi-technology
View
1.194
Download
0
Embed Size (px)
DESCRIPTION
Tez talk on Data Processing over Yarn
Citation preview
Data Processing over YARN
Page 1
© Hortonworks Inc. 2014
Tez – Introduction
• Distributed execution framework targeted towards data-processing applications.
• Based on expressing a computation as a dataflow graph.• Highly customizable to meet a broad spectrum of use cases.• Built on top of YARN – the resource management framework
for Hadoop.• Open source Apache project and Apache licensed.
Page 2
© Hortonworks Inc. 2014
Hadoop 1 -> Hadoop 2
HADOOP 1.0
HDFS(redundant, reliable storage)
MapReduce(cluster resource management
& data processing)
Pig(data flow)
Hive(sql)
Others(cascading)
HDFS2(redundant, reliable storage)
YARN(cluster resource management)
Tez(execution engine)
HADOOP 2.0
Data FlowPig
SQLHive
Others(Cascading)
BatchMapReduce Real Time
Stream Processing
Storm
Online Data
ProcessingHBase,
Accumulo
Monolithic• Resource Management• Execution Engine• User API
Layered• Resource Management – YARN• Execution Engine – Tez• User API – Hive, Pig, Cascading, Your App!
© Hortonworks Inc. 2014
Tez – Problems that it addresses
• Expressing the computation• Direct and elegant representation of the data processing flow
• Performance• Late Binding : Make decisions as late as possible using real data from at
runtime• Leverage the resources of the cluster efficiently• Just work out of the box!• Customizable engine to let applications tailor the job to meet their
specific requirements
• Operation simplicity• Painless to operate, experiment and upgrade
Page 4
© Hortonworks Inc. 2014
Tez – Simplifying Operations
• Tez is a pure YARN application. Easy and safe to try it out!• No deployments to do, no servers to run• Enables running different versions concurrently. Easy to test new
functionality while keeping stable versions for production.• Leverages YARN local resources.
Page 5
ClientMachine
NodeManager
TezTask
NodeManager
TezTaskTezClient
HDFSTez Lib 1 Tez Lib 2
ClientMachine
TezClient
© Hortonworks Inc. 2014
Tez – Expressing the computation
Distributed data processing jobs typically look like DAGs (Directed Acyclic Graph). • Vertices in the graph represent data transformations • Edges represent data movement from producers to consumers
Page 6
Aggregate Stage
Partition Stage
Preprocessor Stage
Sampler
Task-1 Task-2
Task-1 Task-2
Task-1 Task-2
Samples
Ranges
Distributed Sort
© Hortonworks Inc. 2014
MR is a 2-vertex sub-set of Tez
Page 7
© Hortonworks Inc. 2014
But Tez is so much more
Page 8
© Hortonworks Inc. 2014
Tez – Expressing the computation
Tez defines the following APIs to define the work• DAG API
• Defines the structure of the data processing and the relationship between producers and consumers
• Enable definition of complex data flow pipelines using simple graph connection API’s. Tez expands the logical DAG at runtime
• This is how all the tasks in the job get specified
• Runtime API• Defines the interface using which the framework and app code interact
with each other• App code transforms data and moves it between tasks• This is how we specify what actually executes in each task on the cluster
nodes
Page 9
© Hortonworks Inc. 2014
reduce1
map2
reduce2
join1
map1
Scatter_Gather
Bipartite Sequential
Scatter_Gather
Bipartite Sequential
Tez – DAG API
// Define DAGDAG dag = DAG.create(“sessionize”);
// Define Vertex-M1Vertex m1 = Vertex.create("M_1", ProcessorDescriptor.create(MapProcessor_1.class.getName()));
//Define Vertex-R1Vertex r1 = Vertex.create(”R_1", ProcessorDescriptor.create(ReduceProcessor_1.class.getName()));……// Define Edge (edge between m1 and r1)Edge e1 =Edge.create(m1, r1, OrderedPartitionedKVEdgeConfig.newBuilder(…).build() .createDefaultEdgeProperty());
// Connect themdag.addVertex(m1).addVertex(r1).addEdge(e1)…
Page 10
Defines the global processing flow
Tez - Different Edge Properties
11
Scatter-Gather Broadcast One-to-One
© Hortonworks Inc. 2014
Tez – Logical DAG expansion at Runtime
Page 12
reduce1
map2
reduce2
join1
map1
Tez – Runtime API building blocks
13
© Hortonworks Inc. 2014
Tez – Library of Inputs and Outputs
Classical ‘Map’
Page 14
Classical ‘Reduce’
Intermediate ‘Reduce’ for Map-Reduce-Reduce
Map Processor
HDFS Input
Sorted Output
Reduce Processor
Shuffle Input
HDFS Output
Reduce Processor
Shuffle Input
Sorted Output
• What is built in?– Hadoop InputFormat/OutputFormat– OrderedPartitioned Key-Value
Input/Output– UnorderedPartitioned Key-Value
Input/Output– Key-Value Input/Output
© Hortonworks Inc. 2014
Tez – Container Re-Use• Reuse YARN containers/JVMs to launch new tasks• Reduce scheduling and launching delays• Shared in-memory data across tasks• JVM JIT friendly execution
Page 15
YARN Container / JVM
TezTask Host
TezTask1
TezTask2
Sha
red
Obj
ects
YARN Container
Tez Application Master
Start Task
Task Done
Start Task
© Hortonworks Inc. 2014
Container reuse
• Tez specific feature• Run an entire DAG using the same containers• Different vertices use same container• Saves time talking to YARN for new containers
© Hortonworks Inc. 2014
Tez – Sessions
Page 17
Application Master
Client
Start Session
Submit DAG
Task Scheduler
Con
tain
er P
ool
Shared Object
Registry
Pre Warmed
JVM
Sessions• Standard concepts of pre-launch
and pre-warm applied• Key for Interactive queries• Represents a connection between
the user and the cluster• Multiple DAGs/Queries executed in
the same AM• Containers re-used across queries• Takes care of data locality and
releasing resources when idle
© Hortonworks Inc. 2014
Tez – Re-Use in Action (In Session)
Task Execution Timeline
Tez – Auto Reduce Parallelism
© Hortonworks Inc. 2011
19
Tez – Auto Reduce Parallelism
20
© Hortonworks Inc. 2014
Tez – End User Benefits
• Better Performance• Framework performance + application performance
• Better utilization of cluster resources• Efficient use of allocated resources
• Better predictability of results• Minimized queuing delays
• Reduced load on HDFS• Removes unnecessary HDFS writes
• Reduced network usage• Efficient data transfer using new data patterns
• Increased developer productivity• Lets the user concentrate on application logic instead of Hadoop
internals
Page 21
Tez – Real World Examples
© Hortonworks Inc. 2011
22
© Hortonworks Inc. 2014
Tez – Broadcast EdgeSELECT ss.ss_item_sk, avg_price, inv.inv_quantity_on_handFROM (select avg(ss_sales_price) as avg_price, ss_item_sk from store_sales group by ss_item_sk) ssJOIN inventory invON (inv.inv_item_sk = ss.ss_item_sk);
M M M
HDFS
Store Sales scan. Group by and aggregation.
Inventory and Store Sales (aggr.) output scan and shuffle join.
R R
R R
MMM
HDFS
M M M
HDFS
Store Sales scan. Group by and aggregation.
Inventory and Store Sales (aggr.) output scan and shuffle join.
R R
MMM
broadcast
© Hortonworks Inc. 2014
Tez – Multiple OutputsFROM (SELECT * FROM store_sales, date_dim WHERE ss_sold_date_sk = d_date_sk and d_year = 2000)INSERT INTO TABLE t1 SELECT distinct ss_item_skINSERT INTO TABLE t2 SELECT distinct ss_customer_sk;
Hive – MR Hive – Tez
M MM
M
HDFS
Map join date_dim/store sales
Two MR jobs to do the distinct
M MM
M M
HDFS
RR
HDFS
M M M
R
M M M
R
HDFS
Broadcast Join (scan date_dim, join store sales)
Distinct for customer + items
Materialize join on HDFS
Hive : Multi-insert queries
© Hortonworks Inc. 2014
Tez – Data at scale
Page 25
Hive TPC-DS Scale 10TB
© Hortonworks Inc. 2014
Tez – what if you can’t get enough containers?
• 78 vertex + 8374 tasks on 50 YARN containers
Page 26
© Hortonworks Inc. 2014
Tez – Designed for big, busy clusters
• Number of stages in the DAG• Higher the number of stages in the DAG, performance of Tez (over MR) will be
better.• Cluster/queue capacity
• More congested a queue is, the performance of Tez (over MR) will be better due to container reuse.
• Size of intermediate output• More the size of intermediate output, the performance of Tez (over MR) will be
better due to reduced HDFS usage (cross-rack traffic)• Size of data in the job
• For smaller data and more stages, the performance of Tez (over MR) will be better as percentage of launch overhead in the total time is high for smaller jobs.
• Move workloads from gateway boxes to the cluster• Move as much work as possible to the cluster by modelling it via the job DAG.
Exploit the parallelism and resources of the cluster.
Page 27
© Hortonworks Inc. 2014
Tez – Adoption
• Hive• Hadoop standard for declarative access via SQL-like interface
• Pig• Hadoop standard for procedural scripting and pipeline processing
• Cascading• Developer friendly Java API and SDK• Scalding (Scala API on Cascading)
• Commercial Vendors• ETL : Use Tez instead of MR or custom pipelines• Analytics Vendors : Use Tez as a target platform for scaling parallel
analytical tools to large data-sets
Page 28
© Hortonworks Inc. 2014
Tez – Community
• Early adopters and code contributors welcome– Adopters to drive more scenarios. Contributors to make them happen.
• Technical blog series– http://
hortonworks.com/blog/apache-tez-a-new-chapter-in-hadoop-data-processing
• Useful links– Work tracking: https://issues.apache.org/jira/browse/TEZ– Code: https://github.com/apache/tez– Developer list: [email protected]
User list: [email protected] Issues list: [email protected]
Page 29