AUTOMATED JVM TUNING WITH BAYESIAN OPTIMIZATION

Twitter Engineering, Advanced Technology Group

Jianqiao Liu @trackpoint10
Ramki Ramakrishna @ysr1729
Alex Wiltschko @awiltsch

ACKNOWLEDGEMENTS

Material developed with:
Ian Brown @igb
Kevin Swersky @kswersk
Jasper Snoek @latentjasper
Ryan Adams @ryan_p_adams
Hugo Larochelle @hugolarochelle
Dave Barr @davebarr
John Coomes @john_coomes
Ian Downes @ndwns
Dave Robinson @daverobinson
Chris Regado @chrisregado
Todd Stumpf @stumpf
WHO WE ARE

• Jianqiao Liu: Graduate Student, ECE at Purdue University (work done while a summer intern at Twitter San Francisco)
• Ramki Ramakrishna: Staff Engineer, JVM Engineering at Twitter San Francisco
• Alex Wiltschko: Research Engineer, Advanced Technology Group at Twitter Boston
TWITTER RUNS ON MICRO-SERVICES

• O(10^3) services
• O(10^5) service instances
• Heterogeneous hardware
• Varying resources

Source: "How we built a metering and chargeback system to incentivize higher resource utilization of Twitter infrastructure", Micheal Arul, Vinu Charanya, LinuxCon 2016, Toronto, August 22-24, 2016.
A PERFORMANCE STACK AT TWITTER
A simplified view

[Diagram: Microservices A and B each run on a JVM inside a Mesos container, on top of kernel + OS services and hardware. Each layer has its own tunables: hardware (h1, h2, …), kernel (k1, k2, …), Mesos (m1, m2, …), JVM (j1, j2, …), and service (s1, s2, …). Overall performance is a function f(h, k, m, j, s) of all of them.]
TUNING AT THE JVM LAYER

• Hotspot JVM has hundreds of tunable knobs:

  $ java -XX:+PrintFlagsFinal -version | grep "="
  uintx AdaptiveSizePolicyWeight     = 10     {product}
  uintx AdaptiveSizeThroughPutPolicy = 0      {product}
  uintx AdaptiveTimeWeight           = 25     {product}
   bool AdjustConcurrency            = false  {product}
   bool AggressiveOpts               = false  {product}
   intx AliasLevel                   = 3      {C2 product}
   bool AlignVector                  = false  {C2 product}
  …

  $ java -XX:+PrintFlagsFinal -version | grep "=" | wc -l
  757

• A large variety of parameters:
  • performance-sensitivity
  • hardware-dependency
  • mutual (in)dependency
PERFORMANCE OPTIMIZATION
Needs to be continuous

• Hand-tuning doesn't scale:
  • few parameters handled manually
  • time-consuming, labor-intensive, error-prone
• Cargo-culted configurations
• Upgrades make optimality fleeting
• Hypothesis: most micro-services operate below optimality
• Preview: 80% improvement on a large service
PERFORMANCE TUNING
As a formal optimization problem

• Given a function f(x1, x2, …, xn) defined over domain X
• Find a configuration A = (a1, a2, …, an) that maximizes f
PERFORMANCE TUNING
As a constrained optimization problem

• Simple constraints:
  • x1 <= x2 : e.g. NewSize <= HeapSize
  • a <= x3 <= b : e.g. 0 <= MaxTenuringThreshold <= 15
• More complex constraints:
  • g(x1, x2) <= h(x3, x4)
• Constraints on behavior:
  • w(X) < k : e.g. 99th percentile of response latency < 5 ms
  • r(X) = t(X) : e.g. no requests result in errors
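A minimal sketch of how the simple constraints above can be enforced when sampling candidate configurations, by rejection; the parameter names are real HotSpot knobs, but the ranges and the dictionary layout are illustrative:

  import random

  def sample_config():
      # Rejection-sample until the ordering constraint NewSize <= HeapSize holds.
      while True:
          heap_mb = random.randint(512, 8192)
          new_mb = random.randint(64, 8192)
          tenuring = random.randint(0, 15)   # 0 <= MaxTenuringThreshold <= 15
          if new_mb <= heap_mb:              # NewSize <= HeapSize
              return {"HeapSize_mb": heap_mb,
                      "NewSize_mb": new_mb,
                      "MaxTenuringThreshold": tenuring}

Behavioral constraints (latency percentiles, error rates) cannot be checked before running the experiment; they are evaluated from the measurements instead.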
PERFORMANCE TUNING
As optimization of a noisy, non-stationary cost function

The environment may introduce hidden, uncontrollable, and possibly time-varying parameters, e.g.:
• inter-container cross-talk
• seasonal/diurnal environmental conditions or load
• heterogeneous hardware
PERFORMANCE TUNING

• Design (and refine) a suitable performance metric
• Decide on (and refine) knobs to tune
• Use an iterative strategy to tune these knobs
PERFORMANCE TUNING
The manual approach

[Diagram: a performance engineer in a loop with the system being tuned: measure performance with new parameter settings, analyze the results, and pick new parameters to test based on the results obtained.]
PERFORMANCE TUNING
Using an automation assistant

[Diagram: the same loop with a black-box tuning assistant in place of the performance engineer: the system returns an "evaluation" of each parameter setting, and the assistant returns the next "suggestion" to try.]
HOW SHOULD WE BUILD AN AUTOMATION ASSISTANT?

A technique from machine learning: Bayesian Optimization
• A machine learning approach to black-box optimization
• A method to learn (potentially noisy) cost functions
  • iteratively
  • efficiently
• Finds very good answers very quickly on a wide variety of problems

I'll show you how it works in practice.
BAYESIAN OPTIMIZATION

• Each experiment we run with a different setting of our parameters is expensive
• If choosing what experiments to run is important, how do we do it well?

[A sequence of animation slides follows: a probabilistic model is fit to the evaluations gathered so far, and the model is used to pick the next experiment.]
BAYESIAN OPTIMIZATION
BayesOpt in action

[Animation: the "Expected Improvement" acquisition function guides each new experiment until the global optimum is discovered.]

BayesOpt works in much higher dimensions than humans do.
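The acquisition function in those slides is expected improvement: under the model's posterior, how much better than the best score seen so far do we expect a candidate to be? A minimal sketch of one suggestion step, assuming a scikit-learn Gaussian process surrogate (an illustration, not Twitter's production engine):

  import numpy as np
  from scipy.stats import norm
  from sklearn.gaussian_process import GaussianProcessRegressor
  from sklearn.gaussian_process.kernels import Matern

  def expected_improvement(candidates, gp, best_y):
      # Posterior mean and stddev of the surrogate at each candidate point.
      mu, sigma = gp.predict(candidates, return_std=True)
      sigma = np.maximum(sigma, 1e-9)      # guard against zero variance
      z = (mu - best_y) / sigma            # we are maximizing the score
      return (mu - best_y) * norm.cdf(z) + sigma * norm.pdf(z)

  def suggest_next(X_seen, y_seen, bounds, n_candidates=10000):
      # Fit a GP surrogate to the (noisy) evaluations gathered so far.
      gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-4,
                                    normalize_y=True)
      gp.fit(X_seen, y_seen)
      # Score random candidates; suggest the one with the highest EI.
      lo, hi = np.asarray(bounds, dtype=float).T
      candidates = lo + (hi - lo) * np.random.rand(n_candidates, len(bounds))
      ei = expected_improvement(candidates, gp, y_seen.max())
      return candidates[np.argmax(ei)]

Each call proposes the next configuration to test; the measured score is appended to (X_seen, y_seen) and the loop repeats.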
BUILDING AN AUTOTUNING SERVICE
What should an ideal autotuning system look like?

What do we want an implementation of BayesOpt to look like in practice?
• Easy to use
• Minimal coding required by the user
• Support for multiple languages
• Running concurrent experiments should be trivial

BUILDING AN AUTOTUNING SERVICE
An example API call
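As a stand-in for the slide's example, here is a hypothetical client interaction in the suggestion/evaluation style described earlier; the endpoint, field names, and the run_benchmark helper are all invented for illustration, not the actual Twitter API:

  import requests

  BASE = "http://bayesopt.example.com/api"   # placeholder endpoint

  def run_benchmark(params):
      # User-supplied: launch the service with these params, return a score.
      ...

  # Describe the experiment: a name plus the parameters and their bounds.
  exp = requests.post(BASE + "/experiments", json={
      "name": "jvm-tuning-demo",
      "parameters": [
          {"name": "new_size_mb", "min": 64, "max": 4096},
          {"name": "max_tenuring", "min": 0, "max": 15},
      ],
  }).json()

  # Ask for a suggestion, run the experiment, report the evaluation.
  sug = requests.get(BASE + "/experiments/{}/suggestion".format(exp["id"])).json()
  score = run_benchmark(sug["params"])
  requests.post(BASE + "/experiments/{}/evaluation".format(exp["id"]),
                json={"suggestion_id": sug["id"], "score": score})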
BUILDING AN AUTOTUNING SERVICE
The service layout of BayesOpt at Twitter

[Diagram: clients in several languages (Scala, Python, Matlab, Lua) and a CLI client talk to a web server; middleware passes requests to a worker (queue and state manager), which drives the BayesOpt engine running in an auto-scaling group on Mesos.]
ALTERNATIVE APPROACHES
BayesOpt isn't the only way to do it, but it's by far our favorite

• Random Search (Bergstra 2012; shockingly good for zero effort!)
• Parzen Trees (Bergstra 2011)
• Random Forests (Hutter et al., 2011)
• Reinforcement Learning (e.g., Google data center cooling)

We prefer BayesOpt because it's:
• Robust
• Extensible
• Battle-tested on many types of real-world, high-impact problems
BAYESOPT WINS AT TWITTER
We're just getting started

• Spam detection (+8%)
• Abuse detection (+6%)
• All deep learning applications ("set it and forget it" prototyping)
• Vine video recommendations (+30% user engagement on recs)
• Hadoop cost reduction (-80% cost)
• Revenue applications
• JVM performance (take it away, Jianqiao!)
A SAMPLING OF JVM PARAMETERS

• Garbage Collector Type
• New Generation Size
• Survivor Ratio
• Parallel GC Threads
• Concurrent GC Threads
• Pre-fetch Interval Size
• Clip In-lining
• Biased Locking
• …and dozens more
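To make the mapping concrete, a hedged sketch of how such knobs translate into HotSpot flags; the flag names are real HotSpot options, but the parameter dictionary and the particular choices are illustrative, not the tuned optimum:

  def render_flags(params):
      # Render a suggested configuration as HotSpot command-line flags.
      return [
          "-XX:+UseParallelGC" if params["use_parallel_gc"]
              else "-XX:+UseConcMarkSweepGC",
          "-XX:NewSize={}m".format(params["new_size_mb"]),
          "-XX:SurvivorRatio={}".format(params["survivor_ratio"]),
          "-XX:ParallelGCThreads={}".format(params["parallel_gc_threads"]),
          "-XX:ConcGCThreads={}".format(params["conc_gc_threads"]),
          "-XX:MaxTenuringThreshold={}".format(params["max_tenuring"]),
      ]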
A TYPICAL MICRO-SERVICE STACK

[Diagram: a service with parameters S1, S2, …, Sm runs on a JVM with parameters J1, J2, …, Jn, over a kernel with parameters K1, K2, …, Ko and hardware with parameters H1, H2, …, Hp. Performance is F(J, S, K, H, …).]
SPECJBB2015
A JVM benchmark

[Diagram: the same stack with SPECjbb2015 as the workload; only the JVM parameters J1, J2, …, Jn are tuned, and the objective is F(J).]
A MICROSERVICE ON THE JVM

[Diagram: the same stack with Microservice A as the workload, again tuning the JVM parameters J1, J2, …, Jn to maximize F(J).]
MICROSERVICE "A"

• A large production service
• Access to User Objects via a Thrift interface
• Why this service?
  • Mature, does not undergo frequent redeploys
  • Large number of service instances
THE SET-UP

• Environment
  • Microservice staging environment
  • Real production traffic (dark read)
  • Portion of production workload
• Performance metric
  • RPS: requests per second
  • GC_cost: wall-clock time spent in GC
  • Perf = RPS / GC_cost
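A minimal sketch of that objective as code, assuming the metrics service reports requests served, wall-clock seconds, and GC seconds for an instance (the function and argument names are ours):

  def perf_score(requests_served, wall_secs, gc_secs):
      # Perf = RPS / GC_cost: reward throughput, penalize time spent in GC.
      rps = requests_served / wall_secs
      gc_cost = gc_secs / wall_secs      # fraction of wall-clock time in GC
      return rps / gc_cost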
[Diagram: the JVM tuning service sits between the BayesOpt service, a version-controlled file store service, the Aurora scheduler on Mesos, and the observability/metrics service. Suggestions flow in from BayesOpt; configs go to the file store; stop/restart commands go to Aurora, which manages the shards (#0-#4) of Microservice A; baseline and experiment scores come back from the metrics service as a score ratio.]

1. Get a new parameter suggestion from BayesOpt
2. Generate JVM configuration
3. Upload new configuration to File Store Service
4. Get baseline platform information
5. Stop and restart the test instance on specific hardware
6. Test for a fixed duration
7. Get valid baselines
8. Obtain baseline performance score
9. Obtain performance score from experiment
10. Compute the ratio and inform BayesOpt
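Stitched together, one iteration of the ten steps above might look like the following sketch; every helper (get_suggestion, file_store, aurora, metrics, report_evaluation) is a hypothetical stand-in for the corresponding Twitter service, and render_flags and perf_score are the sketches shown earlier:

  import time
  from statistics import mean

  TEST_SHARD = 0                 # illustrative shard id
  TEST_DURATION_SECS = 3600      # illustrative fixed test duration

  def tuning_iteration(experiment):
      params = get_suggestion(experiment)                   # 1. new suggestion
      config = file_store.upload(render_flags(params))      # 2-3. generate + upload config
      platform = aurora.platform_of(TEST_SHARD)             # 4. baseline platform info
      aurora.restart(TEST_SHARD, config, platform)          # 5. stop-and-restart on same hardware
      time.sleep(TEST_DURATION_SECS)                        # 6. test for a fixed duration
      baselines = metrics.valid_baseline_shards(platform)   # 7. valid baselines, matching hardware
      # metrics.fetch(shard) is assumed to return (requests, wall_secs, gc_secs)
      base = mean(perf_score(*metrics.fetch(s)) for s in baselines)   # 8. baseline score
      expt = perf_score(*metrics.fetch(TEST_SHARD))         # 9. experiment score
      report_evaluation(experiment, params, expt / base)    # 10. ratio back to BayesOpt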
EVALUATION
BayesOpt in action

[Chart: performance score ratio (y-axis, 0 to 2.5) over 90 iterations (x-axis) of BayesOpt on Microservice A.]
REQUESTS PER SECOND
Of the optimum BayesOpt result

[Chart: performance of the optimum result against the baseline.]
Performance ratio: 590.253 / 324.237 = 182.0%
GC COST
Of the optimum BayesOpt result

GC_cost ratio: 15.3 / 27.9 = 54.8%, a 45% reduction
PRACTICAL EFFORT

• Apples-to-apples comparison:
  • over 50 different platforms in Mesos; ~12 major platforms
• System services or Python bugs could cause failures:
  • timeouts (subprocess)
  • empty content returned
  • exceptions thrown
• The Mesos/Aurora scheduler might interrupt and restart an instance
• Authentication service ticket timeouts
NEXT STEPS

• Several concurrent evaluations
  • trade off against longer experiment duration
  • one experiment set per hardware platform
  • terminate obviously poor suggestions early
• Stress test/validation of optimal configuration(s)
• General framework/service for optimizing an arbitrary micro-service
• Clean up service authentication and use more robust, official APIs
  • replace Python subprocess calls with direct calls to Python APIs
LESSONS FROM OPTIMAL RESULT

Optimal parameter settings:
• Similar new generation size
• Smaller tenuring threshold, smaller survivor spaces
• Larger prefetch read interval for GC scan
• Old generation promotion allocator filter parameters
• More GC threads
• Higher compilation size threshold

Performance gains:
• GC overhead
• Tail response latency
• Data center footprint
MORE LESSONS

• Choice of performance function
• Choice of parameters to tune
• Duration of evaluation runs
• Concurrent evaluations
• Factor out hardware effects
• Protect against noise
• Use baseline configurations
• Long-range effects
• Stress-testing to filter optima
AUTOMATED PERFORMANCE TUNING
Leverage existing micro-services infrastructure

• BayesOpt suggestions/evaluations may be sub-optimal
• Reliability and redundancy are designed into the micro-services architecture
• Existing monitors/alarms/alerts/sensors/telemetry
CONCLUSION
Automated performance optimization in the DevOps deployment workflow

• Continuous, inexpensive, automated optimization of micro-services is possible, even inevitable
• BayesOpt reduces the number of costly experiments needed to find a near-optimal setting quickly
• Existing micro-services and DevOps frameworks already have most of the infrastructure to support this
QUESTIONS ?
THANK YOU !