20
Learning Cloud Dynamics to Optimize Spot Instance Bidding Strategies Misha Khodak Joint with Liang Zheng, Andrew Lan, Carlee Joe-Wong, and Mung Chiang

Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Learning Cloud Dynamics to Optimize Spot Instance

Bidding StrategiesMisha Khodak

Joint with Liang Zheng, Andrew Lan, Carlee Joe-Wong, and Mung Chiang

Page 2: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Overview• The popularity of cloud computing services has led to the rise of

dual-market pricing schemes:

• Providers sell some instances at a fixed “on-demand” price.

• Excess capacity is sold at a variable “spot-price” determined via an auction.

• We propose a nonlinear dynamical system model to understand spot-price behavior in this environment.

• We verify our model using five months of Amazon EC2 data and demonstrate its potential to inform strategic bidding between heterogeneous cloud resources.

Page 3: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

The Amazon EC2 Spot Market1. Select the type of instance

2. Configure and set a bid price

3. Amazon sets a spot price 4. User receives instance if bid was above spot price

Page 4: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Motivation• Spot price dynamics are poorly understood, with past

economic modeling focusing mainly on global behavior:

• Zheng et al. (SIGCOMM 2015) study the spot price distribution at equilibrium.

• Hoy et al. (WINE 2016) explain the optimality of a two-market design as stemming from variable user risk aversion.

• Understanding temporal dynamics can better inform strategic bidding.

Page 5: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Spot Price Observations

m3.medium spot price over 5 months in 2017

The spot price ⇡t tends

to hover above a constant

lower bound price ⇡.

Sometimes goes above the on-demandprice ⇡. Occurs when on-demand userstake up too much capacity.

Page 6: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Provider Profit Maximizationon-demand instances

spot-market instances

on-demand profit spot-market profit

(⇡ � ⇡)N (d)t + (⇡t � ⇡)N (s)

t

Need a constraint on the number of instances (N):

• Profit-Maximizing: N (d)t +N (s)

t N

• Usage-Maximizing: N (d)t +N (s)

t = N

Page 7: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Maximize Profit or Usage?

Mar. Apr. May Jun. Jul. Aug.

0.10

0.08

0.06

0.04

0.02

0.00

Proposition [KZLJC’18]: If Bt bids are drawn independently from a distri-

bution that weakly stochastically dominates the uniform distribution over [⇡,⇡],then if the provider uses the profit-maximizing constraint we have

P⇣⇡t ⇡(⇢)

⌘ exp

�2

✓1

2

� 2⇢

◆2

Bt

!

for ⇢ 2 [0, 1/4] and ⇡(⇢)= ⇢⇡ + (1� ⇢)⇡.

So a profit-maximizing provider will not set ⇡t close to ⇡ very often, which

contradicts the data and motivates the choice of a usage-maximizing constraint:

Page 8: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Observed Spot Price ModelAt time t cloud provider sees Bt bids, which we model as being i.i.d. draws

from U [⇡,⇡] (Zheng et al., 2015). Then in the limit Bt ! 1 this the spot price

is distributed as

⇡t =

8<

:

⇡ nt + bt 1 (not enough users)

⇡ � (⇡ � ⇡) 1�ntbt

+ "t 0 < 1� nt < bt⇡ nt � 1 (too many on-demand users)

where we define:

nt = N (d)t /N on-demand usage

bt = Bt/N spot usage

"t ⇠ N✓0,

�2

bt

◆observation noise

Page 9: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Job Arrival and DepartureWe model two hidden variables:

1. nt, the number of running on-demand jobs at time t, normalized by N

2. bt, the number of active spot bids at time t, normalized by N

At each time step, ⇤

(d)t on-demand jobs arrive, ⇤

(s)t spot jobs arrive,

˜

(d)t on-

demand jobs complete, and

˜

(s)t spot jobs complete. This yields the dynamical

system

nt+1 = nt + �(d)t � ˜�(d)

t

bt+1 = bt + �(s)t � ˜�(s)

t

for all �t = ⇤t/N modeled as i.i.d. draws from exponential distributions.

Page 10: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Combined ModelOur spot price model is a hidden Markov model (HMM) with hidden statetransition governed by the job arrival/departure model:

nt+1 = nt + �(d)t � �̃(d)

t

bt+1 = bt + �(s)t � �̃(s)

t

and the spot price distribution:

⇡t =

8<

:

⇡ nt + bt 1⇡ � (⇡ � ⇡) 1�nt

bt+ "t 0 < 1� nt < bt

⇡ nt � 1

Five model parameters: a scale parameter for each exponentially-distributed �t

for job arrival/departure and variance �2 of the Gaussian observation noise "t.

observation

hidden state

Page 11: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Parameter Estimation• Model parameters and

hidden states are jointly estimated using Expectation-Maximization (EM).

• The E-step is conducted using a sequential Monte Carlo (“particle filter”) approach: • Better suited better

than Kalman-type filters for non-smooth, singular models.

• Can handle hidden state constraints.

date

dolla

rs

dolla

rsno

. of

req

uest

(pr

op.

of N

)

no.

of r

eque

st (

prop

. of

N)

prediction(95% conf.)

prediction(95% conf.)

1.4

1.2

1.0

0.8

0.6

0.4

0.2

0.0

2.5

2.0

1.5

1.0

0.5

0.0

0.10

0.08

0.06

0.04

0.02

0.00

1.4

1.2

1.0

0.8

0.6

0.4

0.2

0.0

dateFeb. 19 Mar. 19 Apr. 16 Feb. 19 Mar. 19 Apr. 16

dateFeb. 19 Mar. 19 Apr. 16

dateFeb. 19 Mar. 19 Apr. 16

m3.medium, spring 2017

g2.2xlarge, spring 2017

Page 12: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Strategic Bidding• We consider the setting where we want to start a job immediately.

• In the single-instance setting, the optimal strategy is to bid the on-demand price, if one can afford it.

• Instead we can bid within a class of instances by assuming jobs can be easily parallelized.

family of compute-optimized instances

price scales linearly with resources

Page 13: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Choosing Between Instance Families

Given a price ⇡(i)⌧ at time ⌧ for each instance type i in a family I of instances,

we wish to minimize the instance cost:

P (i)⌧ =

t+TiX

t=⌧+1

⇡(i)t

for Ti the amount of time it takes to finish a job on type i.

• Solved by using a spot-price model to find i⌧ = argminE⇡(i)⌧P (i)⌧ .

• In experiments we assume jobs that can be run completely in parallel.

• Can be extended to bidding on non perfectly-parallel jobs and strategic

bidding across geographic regions.

Page 14: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Expected Instance Cost: Leveraging Our Model

• Simulation: use learned parameters to compute the expected instance cost by simulating multiple trajectories. • Requires a lot of computation for high accuracy. • Empirically useful on shorter job lengths.

• Approximation: approximate the expected instance cost using a second-order Taylor expansion. • Cheap to compute. • Empirically useful on longer timescales. • Assumes the job arrival/departure rates are about the same

in both the on-demand and spot market (empirically true).

Page 15: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Expected Instance Cost: Linear Auto-Regression

• Baseline AR(p) model - the price at each time step is some noisy linear combination of the price at p previous time steps.

• Data does not satisfy standard Gaussian error assumptions and uncorrelated residuals.

-0.04 -0.02 0.00 0.02 0.04

0.04

0.02

0.00

-0.02

-0.04

-0.04 -0.02 0.00 0.02 0.04

0.04

0.02

0.00

-0.02

-0.04

-0.005 0.000 0.005

600

500

400

300

200

100

0-0.005 0.000 0.005 0.010

600

500

400

300

200

100

0

AR(1) AR(17)

Page 16: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Evaluating Bidding Strategies

• The performance of each bidding strategy is evaluated using regret: the difference between the cost of the chosen action and that of the best action in hindsight.

• Model-based methods succeed especially well on shorter-term (e.g. 16-hour) jobs.

Reg

ret (

US

Cen

ts)

0

1

2

3

4

Instance Typem3 c3 r3 i3 g2( /20)

Monte CarloApproximationLinear Auto-Regression

Reg

ret (

US

Cen

ts)

0

2.5

5

7.5

10

Instance Typem3 c3 r3 i3 g2( /100)

Monte CarloApproximationLinear Auto-Regression

16 Hours 64 Hours

Page 17: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Performance and Volatility• Model is consistently better than AR on more volatile instances.

• More monetary gain to be had from strategic bidding. • More overlap between the realized cost distributions of

different instances.

0.3 0.4 0.5 0.6

Prob

abili

ty D

ensi

ty

1.0 1.5 2.00.5

Prob

abili

ty D

ensi

ty

0.6 0.8 1.0 1.2 1.4

Prob

abili

ty D

ensi

ty

0.5 1.0 1.5 2.0

Prob

abili

ty D

ensi

ty

0 10 20 30 40 50

Prob

abili

ty D

ensi

ty

Volatile Instances (g2, i3)Non-Volatile Instances (m3, c3, r3)

Payment Distribution for Different Instances

Page 18: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Summary

• We model spot-pricing in cloud computing as a nonlinear dynamical system.

• Amazon EC2 data was used to analyze the problem and learn model parameters.

• We describe strategies for strategic bidding between instances that can make use of the model.

Page 19: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Open Questions• How do we examine the problem in the setting

where dynamics can be influenced across instance families or different regions?

• Provide a model for job departure that explicitly depends on the recent job arrival random variables.

• Devise more sophisticated bidding strategies requiring lighter assumptions concerning job parallelism.

Page 20: Learning Cloud Dynamics to Optimize Spot Instance Bidding ...mkhodak/docs/INFOCOM2018Slides.pdf · Overview • The popularity of cloud computing services has led to the rise of dual-market

Thank you!

Questions?

Contact e-mail: [email protected]