1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Apache Hadoop YARN: Past, Present and Future
Melbourne, Aug. 31, 2016
Junping Du
Who.JSON
{
"name" : "Junping Du" ,
"job_title" : "Lead Software Engineer @ Hortonworks YARN core team",
"experiences" : [ {
"software_industry_years" : 10,
"hadoop_experience" : "Hadoop contributor before YARN came out, Apache Hadoop committer & PMC, Release Manager for Apache Hadoop 2.6",
"non_hadoop_experience" : "Architect in cloud computing and enterprise software"
}],
"email" : "[email protected]"
}
What is Apache Hadoop YARN?
⬢ YARN is short for "Yet Another Resource Negotiator"
⬢ Big Data Operating System
 – Resource management and scheduling
 – Support for "colorful" applications, like: batch, interactive, real-time, etc.
⬢ Enterprise adoption accelerating
 – Secure mode becoming more widespread
 – Multi-tenant support
 – Diverse workloads
⬢ SLAs
 – Tolerance for slow-running jobs decreasing
 – Consistent performance desired
Past
A brief Timeline
– 1st line of code: June–July 2010
– Open sourced: August 2011
– First 2.0 alpha: May 2012
– First 2.0 beta: August 2013
⬢ Sub-project of Apache Hadoop
⬢ Releases tied to Hadoop releases
⬢ Alphas and betas
GA Releases
2.2 (Oct. 2013)
• 1st GA
• MR binary compatibility
• YARN API cleanup
• Testing!
2.3 (Feb. 2014)
• 1st post-GA
• Bug fixes
• Alpha features: load simulator, LCE enhancements
2.4 (Apr. 2014)
• RM fail-over
• CS preemption
• Timeline Service V1
2.5 (Aug. 2014)
• Writable REST APIs
• Timeline Service V1 security
GA Releases (Recent + Planning)
2.6 (Nov. 2014)
• KMS
• Long-running service support
• Rolling upgrade
• Node label support
• Docker container
2.7 (Apr. 2015)
• Pluggable authorization
2.8/2.9 (2nd half of 2016, estimated)
• Shared resource cache
• Timeline Service V1.5
• Graceful decommission
• Log CLI enhancement
3.0 (TBD)
• Timeline Service V2
Outstanding YARN Features released in 2.6/2.7
[Diagram: a cluster split into a Default Partition, Partition B (GPUs), and Partition C (Windows), with nodes running JDK 7 and JDK 8; it illustrates Node Label support, Rolling Upgrade, and Pluggable ACLs.]
Recent Maintenance Releases Updates
⬢ Maintenance releases continue on the 2.6 and 2.7 lines
 – Only blocker and critical fixes are included
⬢ Apache Hadoop 2.6
 – 2.6.4 released in Feb. 2016
 – 2.6.3 released in Dec. 2015
 – 2.6.2 released in Oct. 2015
⬢ Apache Hadoop 2.7
 – 2.7.3 released in Aug. 2016
 – 2.7.2 released in Jan. 2016
 – 2.7.1 released in Jul. 2015
Present
YARN in Modern Data Architecture
⬢ Modern Data Architecture
 – Enables applications to access all your enterprise data through an efficient centralized platform
 – Supported with a centralized approach to governance, security and operations
 – Versatile enough to handle any application and dataset, no matter the size or type
⬢ YARN's evolution: the "CORE" of the Modern Data Architecture
 – Centralized resource management, highly efficient scheduling, a flexible resource model, security and performance isolation, support for "colorful" applications, etc.
Apache Hadoop YARN
[Diagram: an active/standby ResourceManager pair managing NodeManagers 1–4; one node reports "Resources: 128G, 16 vcores" (auto-calculated from node hardware) and carries the label "SAS"; node resources can be updated dynamically.]
NodeManager Resource Management
⬢ Options to report NM resources based on node hardware
 – YARN-160
 – Restart of the NM required to enable the feature
⬢ Alternatively, admins can use the rmadmin command to update a node's resources
 – YARN-291
 – Reads dynamic-resource.xml
 – No restart of the NM or the RM required
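As a sketch of how the YARN-291 path surfaces to admins (assuming the 2.8-era CLI shape; the node address and resource values are hypothetical), the rmadmin command takes a NodeID, a memory size in MB, and a vcore count:

```shell
# Update a node's resources to 96 GB / 12 vcores at runtime (YARN-291);
# no NM or RM restart is needed. The NodeManager address is hypothetical.
yarn rmadmin -updateNodeResource node1.example.com:45454 98304 12
```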
Apache Hadoop YARN Scheduler
[Diagram: the active ResourceManager places an application ("Application, Queue A, 4G, 1 vcore") across queues: Queue A at 50%, Queue B at 25%, Queue C at 25%, plus a non-exclusive "SAS" label. Callouts highlight inter-queue pre-emption improvements, Priority/FIFO and Fair ordering, support for application priority, and a cost-based placement agent for application reservations.]
Capacity scheduler
⬢ Support for application priority within a queue
 – YARN-1963
 – Users can specify application priority
 – Specified as an integer; a higher number means higher priority
 – Application priority can be updated while the application is running
⬢ Improvements to reservations
 – YARN-2572
 – Support for a cost-based placement agent added in addition to greedy
⬢ Queue allocation policy can be switched to fair sharing
 – YARN-3319
 – Containers allocated on a fair-share basis instead of FIFO
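A minimal sketch of how these features surface to users and admins, assuming a 2.8-era cluster; the application id and the queue name "default" are hypothetical:

```shell
# Raise a running application's priority within its queue (YARN-1963);
# priority is an integer, and a higher number means higher priority.
yarn application -appId application_1472601800000_0001 -updatePriority 10

# Switch a queue from FIFO to fair-share container allocation (YARN-3319)
# in capacity-scheduler.xml:
#   <property>
#     <name>yarn.scheduler.capacity.root.default.ordering-policy</name>
#     <value>fair</value>
#   </property>
```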
Capacity scheduler
⬢ Support for non-exclusive node labels
 – YARN-3214
 – Improvement over the exclusive partitions that existed earlier
 – Better for cluster utilization
⬢ Improvements to pre-emption
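As a sketch (the label name "sas" and the node address are hypothetical; syntax follows the 2.8-era node-labels CLI):

```shell
# Create a non-exclusive node label (YARN-3214) and attach it to a node.
# Idle capacity in the "sas" partition can then be shared with
# applications that did not ask for the label.
yarn rmadmin -addToClusterNodeLabels "sas(exclusive=false)"
yarn rmadmin -replaceLabelsOnNode "node1.example.com=sas"
```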
Apache Hadoop YARN Application Lifecycle
[Diagram: the active ResourceManager receives container requests and allocates containers on a node (128G, 16 vcores). The AM and task containers are launched as processes or Docker containers (alpha) via a ContainerExecutor (DCE, LCE, WSCE), with memory and CPU monitored and isolated, and disk and network isolation via CGroups (alpha). Support has been added for container resizing and for graceful decommissioning of NodeManagers. Logs are aggregated to HDFS; history is served by ATS 1.5 (leveldb + HDFS) and the JHS (HDFS).]
Apache Hadoop YARN
⬢ Graceful decommissioning of NodeManagers
 – YARN-914
 – Drains a node that's being decommissioned to allow running containers to finish
⬢ Resource isolation support for disk and network
 – YARN-2619, YARN-2140
 – Containers get a fair share of disk and network resources using CGroups
 – Alpha feature
⬢ Docker support in LinuxContainerExecutor
 – YARN-3853
 – Support to launch Docker containers alongside process containers
 – Alpha feature
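Graceful decommission is driven through refreshNodes. A sketch, assuming the 2.8-era CLI shape and an exclude file that already lists the node to drain:

```shell
# Gracefully decommission the nodes in the exclude file (YARN-914):
# drain them, waiting up to one hour for running containers to finish.
yarn rmadmin -refreshNodes -g 3600
```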
Apache Hadoop YARN
⬢ Support for container resizing
 – YARN-1197
 – Allows applications to change the size of an existing container
⬢ ATS 1.5
 – YARN-4233
 – Stores timeline events on HDFS
 – Better scalability and reliability
Operational support
⬢ Improvements to existing tools (like yarn logs)
⬢ New tools added (yarn top)
⬢ Improvements to the RM UI to expose more details about running applications
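For example (the application and container ids are hypothetical):

```shell
# Fetch aggregated logs for a finished application, or narrow to one container:
yarn logs -applicationId application_1472601800000_0001
yarn logs -applicationId application_1472601800000_0001 \
    -containerId container_1472601800000_0001_01_000002

# Live, top-like view of queues and running applications:
yarn top
```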
Future
Packaging
Containers
– Lightweight mechanism for packaging and resource isolation
– Popularized and made accessible by Docker
– Can replace VMs in some cases; or more accurately, VMs got used in places where they didn't need to be
Native integration ++ in YARN
– Support for "Container Runtimes" in LCE: YARN-3611
– Process runtime
– Docker runtime
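A sketch of how runtime selection works in this model: the LCE picks a runtime per container from environment variables on the container's launch context. The variable names reflect the alpha-stage YARN-3611 line of work, and the image and property value are illustrative:

```shell
# Per-container runtime selection, passed through the container environment:
#   YARN_CONTAINER_RUNTIME_TYPE=docker
#   YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=centos:7
#
# NodeManager side, yarn-site.xml must allow the docker runtime:
#   <property>
#     <name>yarn.nodemanager.runtime.linux.allowed-runtimes</name>
#     <value>default,docker</value>
#   </property>
```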
APIs
Applications need simple APIs
Need to be deployable "easily"
Simple REST API layer fronting YARN
– https://issues.apache.org/jira/browse/YARN-4793
– [Umbrella] Simplified API layer for services and beyond
Spawn services & manage them
YARN as a Platform
YARN itself is evolving to support services and complex apps
– https://issues.apache.org/jira/browse/YARN-4692
– [Umbrella] Simplified and first-class support for services in YARN
Scheduling
– Application priorities: YARN-1963
– Affinity / anti-affinity: YARN-1042
– Services as first-class citizens: preemption, reservations, etc.
YARN as a Platform (Contd)
Application & services upgrades
– "Do an upgrade of my Spark / HBase apps with minimal impact to end-users"
– YARN-4726
Simplified discovery of services via DNS mechanisms: YARN-4757
YARN Federation – to infinity and beyond: YARN-2915
YARN Service Framework
A platform is only as good as its tools
A native YARN framework
– https://issues.apache.org/jira/browse/YARN-4692
– [Umbrella] Native YARN framework layer for services and beyond
Slider supporting a DAG of apps:
– https://issues.apache.org/jira/browse/SLIDER-875
Operational and User Experience
Modern YARN web UI: YARN-3368
Enhanced shell interfaces
Metrics: Timeline Service V2 (YARN-2928)
Application & services monitoring, integration with other systems
First-class support for YARN-hosted services in Ambari
– https://issues.apache.org/jira/browse/AMBARI-17353
Use-cases.. Assemble!
[Diagram: YARN and other platform services (resource management, storage, security, service discovery, management, monitoring, alerts, governance) supporting assemblies of applications: a "Holiday Assembly" (HBase, web server), an "IOT Assembly" (Kafka, Storm, HBase, Solr), and frameworks such as MR, Tez, Spark, etc.]
Future Work List (I)
⬢ Arbitrary resource types
 – YARN-3926
 – Admins can decide which resource types to support
 – Resource types are read from a config file
⬢ New scheduler features
 – YARN-4902
 – Support richer placement strategies such as affinity and anti-affinity
⬢ Distributed scheduling
 – YARN-2877, YARN-4742
 – NMs run a local scheduler
 – Allows faster scheduling turnaround
⬢ YARN federation
 – YARN-2915
 – Allows YARN to scale out to tens of thousands of nodes
 – A cluster of clusters that appears as a single cluster to an end user
⬢ Better support for disk and network isolation
 – Tied to supporting arbitrary resource types
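A sketch of how arbitrary resource types (YARN-3926) were shaping up to be configured, as the feature later landed; the resource names "gpu" and "fpga" are illustrative:

```shell
# Cluster-wide: declare extra resource types in resource-types.xml:
#   <property>
#     <name>yarn.resource-types</name>
#     <value>gpu,fpga</value>
#   </property>
#
# Per node: advertise capacity for each declared type, e.g.
#   yarn.nodemanager.resource-type.gpu = 4
```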
Future Work List (II)
⬢ Simplified and first-class support for services in YARN
 – YARN-4692
 – Container restart (YARN-3988)
  • Allow container restart without losing the allocation
 – Service discovery via DNS (YARN-4757)
  • Running services can be discovered via DNS
 – Allocation re-use (YARN-4726)
  • Allow AMs to stop a container without losing resources on the node
⬢ Enhanced Docker support
 – YARN-3611
 – Support for mounting volumes
 – Isolate containers using CGroups
⬢ ATS v2 Phase 2
 – YARN-2928 (Phase 1), YARN-5355 (Phase 2)
 – Run the timeline service on HBase
 – Support for more data, better performance
⬢ Also in the pipeline
 – Switch to Java 8 with Hadoop 3.0
 – Add support for GPU isolation
 – Better tools to detect limping nodes
 – New RM UI: YARN-3368
HDP Evolution with Apache Hadoop YARN
[Diagram: HDP's evolution with YARN, from 1.x through 2.x and beyond.]
Thank you!