Resource Aware Scheduling in Apache Storm

Presented by Boyang Jerry Peng

ABOUT ME
• Apache Storm Committer and PMC member
• Member of Yahoo’s low latency team
  - Data processing solutions with low latency
• Graduate student @ University of Illinois, Urbana-Champaign
  - Research emphasis in distributed systems and stream processing
• Contact: jerrypeng@yahoo-inc.com

AGENDA
• Overview of Apache Storm
• Problems and Challenges
• Introduction of Resource Aware Scheduler
• Results

OVERVIEW
• Apache Storm is an open source distributed real-time data stream processing platform
  - Real-time analytics
  - Online machine learning
  - Continuous computation
  - Distributed RPC
  - ETL

STORM TOPOLOGY

• Processing can be represented as a directed graph
• Spouts are sources of information
• Bolts are operators that process data

DEFINITIONS OF STORM TERMS
• Stream
  - An unbounded sequence of tuples
• Component
  - A processing operator in a Storm topology that is either a Bolt or a Spout
• Executors
  - Threads spawned in worker processes that execute the logic of components
• Worker Process
  - A process spawned by Storm that may run one or more executors

STORM ARCHITECTURE
[Figure: Storm architecture. Nimbus runs on the master node, Zookeeper provides cluster coordination, and each Supervisor launches worker processes.]

LOGICAL VS PHYSICAL CONNECTION IN STORM

OVERVIEW OF SCHEDULING IN STORM
• Default Scheduling Strategy
  - Naïve round robin scheduler
  - Naïve load limiter (Worker Slots)
• Multitenant Scheduler
  - Default Scheduler with multitenant capabilities (supported by security)
  - Can allocate a set of isolated nodes for a topology (Soft Partitioning)
• Resource Aware Scheduler
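To make the "naïve round robin" idea concrete, here is a minimal sketch of assigning executors to worker slots in turn. The class and method names are hypothetical and this is not Storm's actual scheduler code. It only illustrates that such a strategy spreads executors evenly by count and never looks at how much CPU or memory each executor needs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the Storm code base.
final class RoundRobinSketch {
    // Assigns executor indices 0..numExecutors-1 to worker slots in turn.
    // Returns one list of executor indices per slot.
    public static List<List<Integer>> assign(int numExecutors, int numSlots) {
        if (numSlots <= 0) {
            throw new IllegalArgumentException("numSlots must be positive");
        }
        List<List<Integer>> slots = new ArrayList<>();
        for (int s = 0; s < numSlots; s++) {
            slots.add(new ArrayList<>());
        }
        for (int e = 0; e < numExecutors; e++) {
            // Resource demand of the executor is ignored entirely.
            slots.get(e % numSlots).add(e);
        }
        return slots;
    }

    public static void main(String[] args) {
        // 7 executors over 3 slots: [[0, 3, 6], [1, 4], [2, 5]]
        System.out.println(assign(7, 3));
    }
}
```

Because placement depends only on the executor's index, a slot can end up with several heavy executors while another holds only light ones.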

RUNNING STORM AT YAHOO - CHALLENGES
• Increasingly heterogeneous clusters
  - Isolation Scheduler – handing out dedicated machines
• Low overall cluster resource utilization
  - Users not utilizing their isolated allocation very well
• Unbalanced resource usage
  - Some machines not used, others overused
• Per topology scheduling strategy
  - Different topologies have different scheduling needs (e.g. constraint based scheduling)

RUNNING STORM AT YAHOO – SCALE
[Figure: Total Nodes Running Storm at Yahoo, 2012 to 2016. Series: Total Nodes, Largest Cluster Size. Axis ranges: 0 to 4000 and 0 to 800.]

RESOURCE AWARE SCHEDULING IN STORM
• Scheduling in Storm that takes into account resource availability on machines and resource requirements of workloads when scheduling the topology
  - Fine grain resource control
  - Resource Aware Scheduler (RAS) implements this function
  - Includes many nice multi-tenant features
• Built on top of:
  - Peng, Boyang, Mohammad Hosseini, Zhihao Hong, Reza Farivar, and Roy Campbell. "R-storm: Resource-aware scheduling in storm." In Proceedings of the 16th Annual Middleware Conference, pp. 149-161. ACM, 2015.

RAS API
• Fine grain resource control
  - Allows users to specify resource requirements for each component (Spout or Bolt) in a Storm Topology
• API to set component memory requirement:
    public T setMemoryLoad(Number onHeap, Number offHeap)
• API to set component CPU requirement:
    public T setCPULoad(Number amount)
• Example of Usage:
    TopologyBuilder builder = new TopologyBuilder();
    // Spout "word": parallelism 10, requesting 1024 MB on-heap and 512 MB off-heap memory
    SpoutDeclarer s1 = builder.setSpout("word", new TestWordSpout(), 10);
    s1.setMemoryLoad(1024.0, 512.0);
    // Bolt "exclaim1": parallelism 3, requesting a CPU load of 100
    builder.setBolt("exclaim1", new ExclamationBolt(), 3)
           .shuffleGrouping("word")
           .setCPULoad(100.0);

CLUSTER CONFIGURATIONS
conf/storm.yaml
    supervisor.memory.capacity.mb: 20480.0
    supervisor.cpu.capacity: 400.0
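As a worked example of how these capacities relate to the per-component requests in the RAS API slide, the sketch below totals the requests from the usage example and compares them with the supervisor capacities from storm.yaml. It assumes the requests apply to each executor of a component, and the helper class is hypothetical, not part of the Storm API.

```java
// Hypothetical helper for illustration; not part of the Storm API.
final class SupervisorCapacitySketch {
    // Capacity values copied from the storm.yaml example.
    public static final double MEMORY_CAPACITY_MB = 20480.0;
    public static final double CPU_CAPACITY = 400.0;

    // Total memory requested by all executors of one component
    // (assumption: the request applies to each executor).
    public static double totalMemoryMb(int executors, double onHeapMb, double offHeapMb) {
        return executors * (onHeapMb + offHeapMb);
    }

    // Total CPU requested by all executors of one component.
    public static double totalCpu(int executors, double cpuPerExecutor) {
        return executors * cpuPerExecutor;
    }

    // True if the given totals fit on a single supervisor.
    public static boolean fits(double memoryMb, double cpu) {
        return memoryMb <= MEMORY_CAPACITY_MB && cpu <= CPU_CAPACITY;
    }

    public static void main(String[] args) {
        double spoutMemory = totalMemoryMb(10, 1024.0, 512.0); // 10 * 1536 = 15360 MB
        double boltCpu = totalCpu(3, 100.0);                   // 3 * 100 = 300
        System.out.println("memory=" + spoutMemory + " cpu=" + boltCpu
                + " fits=" + fits(spoutMemory, boltCpu));
    }
}
```

With these numbers the ten spout executors request 15360 MB against a 20480 MB capacity, and the three bolt executors request a CPU load of 300 against a capacity of 400, so both would fit on one such supervisor.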

RAS FEATURES – PLUGGABLE PER TOPOLOGY SCHEDULING STRATEGIES
• Allows users to specify which scheduling strategy to use
• Default Strategy
  - Based on: Peng, Boyang, Mohammad Hosseini, Zhihao Hong, Reza Farivar, and Roy Campbell. "R-storm: Resource-aware scheduling in storm." In Proceedings of the 16th Annual Middleware Conference, pp. 149-161. ACM, 2015.
  - Enhancements have been made (e.g. limiting max heap size per worker, better rack selection algorithm, etc.)
  - Aims to pack the topology as tightly as possible on machines to reduce communication latency and increase utilization
  - Collocates components that communicate with each other (operator chaining)
• Constraint Based Scheduling Strategy
  - CSP problem solver
    conf.setTopologyStrategy(DefaultResourceAwareStrategy.class);

RAS FEATURES – RESOURCE ISOLATION VIA CGROUPS (LINUX PLATFORMS ONLY*)
• Replaces resource isolation via isolated nodes
• Resource quotas enforced on a per worker basis
• Each worker should not go over its allocated resource quota
• Guarantees QoS and topology isolation
• Documentation: https://storm.apache.org/releases/2.0.0-SNAPSHOT/cgroups_in_storm.html
*RHEL 7 or higher. Potential critical bugs in older RHEL versions.

RAS FEATURES – PER USER RESOURCE GUARANTEES
• Configurable per user resource guarantees

RAS FEATURE – TOPOLOGY PRIORITY
• Users can set the priority of a topology to indicate its importance
    conf.setTopologyPriority(int priority)
• Topology priorities range from 0 to 29. They are partitioned into several priority levels, each covering a range of priorities:
    PRODUCTION => 0 – 9
    STAGING => 10 – 19
    DEV => 20 – 29
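The mapping from a numeric priority to its level can be sketched as a small lookup. The enum and method below are hypothetical names used only for illustration. The ranges are the ones listed on this slide.

```java
// Hypothetical illustration of the priority ranges listed on the slide.
final class PriorityLevelSketch {
    public enum Level { PRODUCTION, STAGING, DEV }

    // Maps a topology priority in [0, 29] to its priority level.
    public static Level levelOf(int priority) {
        if (priority < 0 || priority > 29) {
            throw new IllegalArgumentException("priority must be in 0..29: " + priority);
        }
        if (priority <= 9) {
            return Level.PRODUCTION;
        }
        if (priority <= 19) {
            return Level.STAGING;
        }
        return Level.DEV;
    }

    public static void main(String[] args) {
        System.out.println(levelOf(5));  // PRODUCTION
        System.out.println(levelOf(15)); // STAGING
        System.out.println(levelOf(25)); // DEV
    }
}
```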

RAS FEATURES – PLUGGABLE TOPOLOGY PRIORITY
• Topology Priority Strategy
  - Which topology should be scheduled first?
  - Cluster wide configuration set in storm.yaml
  - Default Topology Priority Strategy
    - Takes into account resource guarantees and topology priority
    - Schedules topologies from the user who is the most under his or her resource guarantee
    - Topologies of each user are sorted by priority
  - More details: https://storm.apache.org/releases/2.0.0-SNAPSHOT/Resource_Aware_Scheduler_overview.html
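A rough sketch of the ordering described for the default priority strategy follows. It is a simplification under stated assumptions and not Storm's implementation: resources are collapsed into a single number per user, a lower priority value is assumed to mean a more important topology (matching PRODUCTION => 0 – 9), and all class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified illustration; not Storm's actual strategy code.
final class PriorityOrderSketch {
    public static final class User {
        public final String name;
        public final double guaranteed; // guaranteed resources (single dimension, assumed)
        public final double used;       // resources currently in use

        public User(String name, double guaranteed, double used) {
            this.name = name;
            this.guaranteed = guaranteed;
            this.used = used;
        }

        // How far the user is below the guarantee.
        public double headroom() {
            return guaranteed - used;
        }
    }

    public static final class Topology {
        public final String name;
        public final String owner;
        public final int priority; // assumed: lower value = more important

        public Topology(String name, String owner, int priority) {
            this.name = name;
            this.owner = owner;
            this.priority = priority;
        }
    }

    // Users furthest under their guarantee go first; within a user,
    // topologies are ordered by priority value.
    public static List<String> schedulingOrder(List<User> users, List<Topology> topologies) {
        List<User> byHeadroom = new ArrayList<>(users);
        byHeadroom.sort(Comparator.comparingDouble((User u) -> u.headroom()).reversed());
        List<String> order = new ArrayList<>();
        for (User u : byHeadroom) {
            List<Topology> mine = new ArrayList<>();
            for (Topology t : topologies) {
                if (t.owner.equals(u.name)) {
                    mine.add(t);
                }
            }
            mine.sort(Comparator.comparingInt((Topology t) -> t.priority));
            for (Topology t : mine) {
                order.add(t.name);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        List<User> users = new ArrayList<>();
        users.add(new User("alice", 100.0, 80.0)); // 20 under guarantee
        users.add(new User("bob", 100.0, 20.0));   // 80 under guarantee
        List<Topology> topologies = new ArrayList<>();
        topologies.add(new Topology("a-dev", "alice", 25));
        topologies.add(new Topology("a-prod", "alice", 5));
        topologies.add(new Topology("b-staging", "bob", 15));
        System.out.println(schedulingOrder(users, topologies));
    }
}
```

In this example bob is further under his guarantee than alice, so his topology is considered first even though alice owns a production-level topology.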

RAS FEATURES – PLUGGABLE TOPOLOGY EVICTION STRATEGIES
• Topology Eviction Strategy
  - When there are not enough resources, which topology from which user should be evicted?
  - Cluster wide configuration set in storm.yaml
  - Default Eviction Strategy
    - Based on how much a user’s guarantee has been satisfied
    - Priority of the topology
  - FIFO Eviction Strategy
    - Used on our staging clusters
    - Ad hoc use
  - More details: https://storm.apache.org/releases/2.0.0-SNAPSHOT/Resource_Aware_Scheduler_overview.html

SELECTED RESULTS (THROUGHPUT) FROM PAPER [1] – YAHOO TOPOLOGIES
[Figures from [1]: throughput results for Yahoo topologies, annotated "47% improvement!" and "50% improvement!"]

PRELIMINARY RESULTS IN YAHOO STORM CLUSTERS

CONCLUDING REMARKS AND FUTURE WORK
• In Summary
  - Built resource aware scheduler
• Migration Process
  - In the process of migrating from MultitenantScheduler to RAS
  - Working through bugs with Cgroups, Java, and Linux kernel
• Future Work
  - Improved Scheduling Strategies
  - Real-time resource monitoring
  - Elasticity

QUESTIONS

REFERENCES
• [1] Peng, Boyang, Mohammad Hosseini, Zhihao Hong, Reza Farivar, and Roy Campbell. "R-storm: Resource-aware scheduling in Storm." In Proceedings of the 16th Annual Middleware Conference, pp. 149-161. ACM, 2015. http://web.engr.illinois.edu/~bpeng/files/r-storm.pdf
• [2] Official Resource Aware Scheduler Documentation: https://storm.apache.org/releases/2.0.0-SNAPSHOT/Resource_Aware_Scheduler_overview.htm
• [3] Umbrella Jira for Resource Aware Scheduling in Storm: https://issues.apache.org/jira/browse/STORM-893

EXTRA SLIDES

PROBLEM FORMULATION
• Targeting 3 types of resources
  - CPU, Memory, and Network
• Limited resource budget for each node
• Specific resource needs for each task
Goal: Improve throughput by maximizing utilization and minimizing network latency

PROBLEM FORMULATION
• Set of all tasks $\mathcal{T} = \{\tau_1, \tau_2, \tau_3, \dots\}$; each task $\tau_i$ has resource demands
  - CPU requirement of $c_{\tau_i}$
  - Network bandwidth requirement of $b_{\tau_i}$
  - Memory requirement of $m_{\tau_i}$
• Set of all nodes $N = \{\theta_1, \theta_2, \theta_3, \dots\}$
  - Total available CPU budget of $W_1$
  - Total available Bandwidth budget of $W_2$
  - Total available Memory budget of $W_3$
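One way to read these definitions is as per-node budget constraints. The sketch below is an assumption, not a formula taken from the slides: it writes the task set Ƭ as $\mathcal{T}$, introduces the hypothetical notation $\mathcal{T}_\theta \subseteq \mathcal{T}$ for the tasks assigned to node $\theta$, and assumes every node has the same budgets $W_1$, $W_2$, $W_3$.

```latex
\begin{aligned}
\sum_{\tau_i \in \mathcal{T}_\theta} c_{\tau_i} &\le W_1, \\
\sum_{\tau_i \in \mathcal{T}_\theta} b_{\tau_i} &\le W_2, \\
\sum_{\tau_i \in \mathcal{T}_\theta} m_{\tau_i} &\le W_3
\end{aligned}
\qquad \text{for every node } \theta \in N
```

Each line is a knapsack-style capacity bound on one resource axis. Later slides treat only the memory bound as hard. The CPU and bandwidth bounds may be exceeded at the cost of degraded performance.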

PROBLEM FORMULATION

• $Q_i$: Throughput contribution of each node
• Assign tasks to a subset of nodes $N' \subseteq N$ that minimizes the total resource waste:

PROBLEM FORMULATION

• Quadratic Multiple 3D Knapsack Problem
  - We call it QM3DKP!
  - NP-Hard!
• Computing optimal or approximate solutions may be hard and time consuming
• Real time systems need fast scheduling
  - Re-compute scheduling when failures occur

SOFT CONSTRAINTS VS HARD CONSTRAINTS
• Soft Constraints
  - CPU and Network Resources
  - Graceful performance degradation with oversubscription
• Hard Constraints
  - Memory
  - Oversubscribe -> Game over
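The distinction between the two constraint types can be sketched as a placement check. The names below are hypothetical and the three-way verdict is an illustrative simplification: exceeding memory rejects the placement outright, while exceeding CPU or network bandwidth is allowed but flagged as degraded.

```java
// Hypothetical illustration of soft versus hard resource constraints.
final class ConstraintSketch {
    public enum Verdict { OK, DEGRADED, REJECTED }

    // Checks a task's demands against a node's remaining resources.
    public static Verdict check(double memDemand, double memAvailable,
                                double cpuDemand, double cpuAvailable,
                                double bwDemand, double bwAvailable) {
        // Hard constraint: memory must never be oversubscribed.
        if (memDemand > memAvailable) {
            return Verdict.REJECTED;
        }
        // Soft constraints: oversubscription degrades performance gracefully.
        if (cpuDemand > cpuAvailable || bwDemand > bwAvailable) {
            return Verdict.DEGRADED;
        }
        return Verdict.OK;
    }

    public static void main(String[] args) {
        System.out.println(check(1024, 2048, 100, 400, 10, 100)); // OK
        System.out.println(check(1024, 2048, 500, 400, 10, 100)); // DEGRADED
        System.out.println(check(4096, 2048, 100, 400, 10, 100)); // REJECTED
    }
}
```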

OBSERVATIONS ON NETWORK LATENCY
1. Inter-rack communication is the slowest
2. Inter-node communication is slow
3. Inter-process communication is faster
4. Intra-process communication is the fastest
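These four observations suggest a simple ordinal "network distance" between two executors. The sketch below is a hypothetical encoding, not something defined in the slides: it assumes node identifiers are unique within a rack and worker identifiers are unique within a node, and the numeric values only express relative order.

```java
// Hypothetical ordinal encoding of the four latency tiers.
final class NetworkDistanceSketch {
    public static final class Placement {
        public final String rack;
        public final String node;
        public final String worker;

        public Placement(String rack, String node, String worker) {
            this.rack = rack;
            this.node = node;
            this.worker = worker;
        }
    }

    // 0 = same worker process (fastest), 1 = different processes on one node,
    // 2 = different nodes in one rack, 3 = different racks (slowest).
    public static int distance(Placement a, Placement b) {
        if (!a.rack.equals(b.rack)) {
            return 3;
        }
        if (!a.node.equals(b.node)) {
            return 2;
        }
        if (!a.worker.equals(b.worker)) {
            return 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        Placement p = new Placement("rack1", "node1", "worker1");
        System.out.println(distance(p, new Placement("rack1", "node1", "worker1"))); // 0
        System.out.println(distance(p, new Placement("rack1", "node1", "worker2"))); // 1
        System.out.println(distance(p, new Placement("rack1", "node2", "worker1"))); // 2
        System.out.println(distance(p, new Placement("rack2", "node1", "worker1"))); // 3
    }
}
```

A scheduler that prefers placements with a smaller distance to already-placed neighbours will tend to collocate communicating components.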

HEURISTIC ALGORITHM
• Greedy approach
• Designing a 3D resource space
  - Each resource maps to an axis
  - Can be generalized to nD resource space
  - Trivial overhead!
• Based on:
  - min (Euclidean distance)
  - Satisfy hard constraints
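The greedy step can be sketched as follows: for each task, skip nodes that violate the hard memory constraint, then pick the node whose remaining resources are closest (in Euclidean distance) to the task's demand. This is a simplified reading of the slide, not the published algorithm. The axes are raw CPU, memory, and bandwidth values without normalization or weighting, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified illustration of a greedy min-distance placement.
final class GreedyPlacementSketch {
    public static final class Node {
        public final String name;
        public double cpu;       // remaining CPU
        public double memory;    // remaining memory
        public double bandwidth; // remaining network bandwidth

        public Node(String name, double cpu, double memory, double bandwidth) {
            this.name = name;
            this.cpu = cpu;
            this.memory = memory;
            this.bandwidth = bandwidth;
        }
    }

    // Places one task and deducts its demand from the chosen node.
    // Returns null if no node satisfies the hard memory constraint.
    public static Node place(double cpu, double memory, double bandwidth, List<Node> nodes) {
        Node best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Node n : nodes) {
            if (memory > n.memory) {
                continue; // hard constraint: never oversubscribe memory
            }
            double dc = n.cpu - cpu;
            double dm = n.memory - memory;
            double db = n.bandwidth - bandwidth;
            double distance = Math.sqrt(dc * dc + dm * dm + db * db);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = n;
            }
        }
        if (best != null) {
            // CPU and bandwidth are soft constraints and may go negative.
            best.cpu -= cpu;
            best.memory -= memory;
            best.bandwidth -= bandwidth;
        }
        return best;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("A", 400.0, 20480.0, 100.0));
        nodes.add(new Node("B", 100.0, 2048.0, 10.0));
        nodes.add(new Node("C", 100.0, 512.0, 10.0));
        Node chosen = place(100.0, 1024.0, 10.0, nodes);
        System.out.println(chosen == null ? "none" : chosen.name);
    }
}
```

In the example, node C is skipped because it lacks memory, and node B wins over node A because its remaining resources match the task's demand most closely, which leaves the least waste. A practical version would normalize the axes so that memory in MB does not dominate the distance.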

HEURISTIC ALGORITHM

• Our proposed heuristic algorithm has the following properties:
1) Tasks of components that communicate with each other will have the highest priority to be scheduled in close network proximity to each other.
2) No hard resource constraint is violated.
3) Resource waste on nodes is minimized.
