Network-aware migration control and scheduling of differentiated virtual
machine workloads
Alexander Stage and Thomas Setzer, Technische Universität München (TUM)
Chair for Internet-based Information Systems
ICSE Workshop on Software Engineering Challenges in Cloud Computing, Vancouver,
Canada, May 2009
Introduction
Server-virtualization-based workload consolidation is increasingly used to:
Raise server utilization levels
Ensure cost-efficient data center operations
Unforeseen spikes or shifts in workloads require dynamic workload management to avoid server overload:
Continuously align placements of virtual machines (VMs) via VM live migration
How Does Live Migration Work?
Phase 1: Setup
Create a TCP connection between source and destination
Copy the VM's profile to the destination
Create a VM on the destination
[Diagram: configuration data (.BIN/.VSV/.XML) is copied from the source node (Host A) to the destination node (Host B); the .VHD resides on network storage]
How Does Live Migration Work?
Phase 2: Memory migration
Transfer memory to the destination
Track the pages dirtied while transferring memory
Pause the VM on the source node when starting the last transfer
[Diagram: memory content is transferred from the source node to the destination node; the .VHD stays on network storage]
How Does Live Migration Work?
Phase 3: State migration
Migrate the VM's register state from the source node
Start the VM on the destination node
Clean up the old VM on the source node
[Diagram: the running state moves from the source node to the destination node, where the VM's files (.BIN/.VSV/.XML) now reside; the .VHD stays on network storage]
Motivation
VM live migration enables:
Dynamic resource provisioning
Load balancing
But it imposes significant overheads that must be considered and controlled:
CPU overhead [17]
Network overhead, which depends on the network topology
[17] T. Wood, P. Shenoy, A. Venkataramani, and M. Yousif. Black-box and gray-box strategies for virtual machine migration. In 4th USENIX Symp. on Networked Systems Design and Impl., pages 229–242, 2007.
Network overhead of Live Migration (1/2)
In live migration phase 2, an iterative, bandwidth-adapting pre-copy memory page transfer algorithm is used.
Objectives:
Minimize VM downtime
Keep total migration time low
Lower the aggregated bandwidth consumption of a migration
The network overhead is not negligible [5]:
about 500 Mb/s for 10 seconds even for a trivial web server VM
[5]C. Clark, K. Fraser, S. H, J. G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield. Live migration of virtual machines. In Proc. of 2nd ACM/USENIX Symp. on Network Systems Design and Implementation, pages 273–286, 2005.
Network overhead of Live Migration (2/2)
Example: 20 VM migrations must be executed within 5 minutes.
Assume each migration consumes 1 Gb/s for 20 seconds.
Scheduling them sequentially over a single 10 (1) Gb/s link saturates the link completely for 40 (400) seconds.
Outcome: migrations impose sudden network load increases that can lead to resource shortages.
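The arithmetic behind this example can be checked directly (a sketch; like the slide, it assumes the migration traffic can be packed at the full link rate):

```python
# Slide's assumptions: 20 migrations, each consuming 1 Gb/s for 20 seconds.
migrations = 20
rate_gbps = 1.0    # bandwidth consumed by one migration (Gb/s)
duration_s = 20.0  # duration of one migration (s)

traffic_gb = migrations * rate_gbps * duration_s  # total traffic: 400 Gb

# Time for which a single link is fully saturated by this traffic:
for link_gbps in (10.0, 1.0):
    print(f"{link_gbps:g} Gb/s link: saturated for {traffic_gb / link_gbps:g} s")
```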
Migration scheduling architecture
To deal with the network overhead of live migration, we propose a migration scheduling architecture.
Data Center: collect performance parameters
Classify the workload type and predict host utilization
Determine expected resource bottlenecks and low utilization levels
Handle unexpected situations such as sudden surges in resource demand
Decide on an operational live migration plan that avoids migration-related SLA violations
Workload classifier (1/2)
We identify the following main workload attributes for our classification:
Predictability: workload behavior can be reliably forecasted for a given period of time, with tightly bounded forecasting errors.
Trend: the degree of upward- or downward-leading demand trends.
Periodicity: the length (time scale) and the power of recurring patterns.
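The slides leave the classifier's implementation out of scope; still, a minimal sketch shows how the predictability and trend attributes could be derived from a demand time series (the function name, thresholds, and half-window trend estimate are illustrative assumptions, not the paper's method):

```python
import statistics

def classify_workload(samples, var_threshold=0.15, trend_threshold=0.05):
    """Toy classifier over a demand time series (illustrative thresholds).

    Derives two of the attributes above: predictability (via relative
    variability) and trend; periodicity detection is omitted here.
    """
    mean = statistics.fmean(samples)
    rel_var = statistics.pstdev(samples) / mean if mean else 0.0
    # Trend: normalized difference between the second and first half.
    half = len(samples) // 2
    trend = (statistics.fmean(samples[half:]) -
             statistics.fmean(samples[:half])) / mean if mean else 0.0
    return {
        "predictable": rel_var < var_threshold,
        "trend": "up" if trend > trend_threshold
                 else "down" if trend < -trend_threshold else "flat",
    }

print(classify_workload([50, 52, 49, 51, 50, 53, 51, 52]))  # stable workload
print(classify_workload([10, 40, 15, 60, 20, 70, 25, 80]))  # volatile, rising
```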
Workload classifier (2/2)
For example:
1. Predictable, low-variability, low-trend workloads: can be co-hosted more aggressively by exploiting workload complementarities.
2. Highly non-predictable workloads: require certain buffer capacity on hosts to guarantee overload avoidance.
Note: the implementation of the workload classifier is not within the scope of this paper; it supervises a target for a period of time and then makes a class-assignment decision.
Allocation Planner (1/2)
For predictable workload classes
Intuition: by co-hosting VMs with complementary workloads, high resource utilization can be achieved.
Method: at runtime, use live migration to execute a VM re-allocation plan that optimizes the VM allocation.
Objective: decrease the number of required hosts; achieve high resource utilization without overload.
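As a sketch of this intuition (the paper's actual optimization model is not given on the slides), a first-fit placement that checks the element-wise sum of time-varying demand profiles against host capacity shows why complementary workloads allow aggressive co-hosting:

```python
def first_fit_consolidate(vm_profiles, capacity=1.0):
    """First-fit placement of VMs with time-varying demand profiles.

    A VM fits on a host iff the element-wise sum of the co-hosted
    profiles never exceeds capacity, so complementary (anti-correlated)
    workloads can share a host aggressively.
    """
    hosts = []  # each host holds the element-wise sum of its VMs' profiles
    for profile in vm_profiles:
        for host in hosts:
            if all(h + p <= capacity for h, p in zip(host, profile)):
                for t, p in enumerate(profile):
                    host[t] += p
                break
        else:
            hosts.append(list(profile))
    return hosts

# A day-peaking and a night-peaking VM are complementary and share a host;
# a third, flat VM needs a second host.
day, night, flat = [0.7, 0.2], [0.2, 0.7], [0.5, 0.5]
print(len(first_fit_consolidate([day, night, flat])))  # 2
```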
Allocation Planner (2/2)
For non-predictable workload classes
Method: set a rather conservative threshold on overall host utilization to avoid overload; if the threshold is exceeded, one or more VMs are selected as migration candidates.
Objective: avoiding overload is the first priority.
Migration Scheduler (1/3)
The bandwidth-adapting pre-copy memory page transfer algorithm:
1. All main memory pages are transferred.
2. Only the memory pages written to (dirtied) during the previous iteration are transferred; bandwidth usage is adaptively increased in each iteration.
3. If the set of dirtied memory pages is sufficiently small, or the upper bandwidth limit is reached, go to step 4; otherwise go to step 2.
4. The last pre-copy iteration is started (service downtime).
i_q = duration of the q-th iteration of VM i
b_i = constant bandwidth adaptation rate of VM i
m_i = memory size of VM i
r_i = constant memory dirtying rate of VM i
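Using these definitions, the iterative pre-copy process can be simulated (a sketch; the stop fraction, linear bandwidth adaptation, and all parameter values are assumptions, not the paper's numbers):

```python
def simulate_precopy(m, r, b0, b_step, b_max, stop_frac=0.01, max_iters=30):
    """Simulate iterative pre-copy memory transfer.

    m: memory size m_i (MB), r: dirtying rate r_i (MB/s),
    b0: initial bandwidth (MB/s), b_step: adaptive increase per iteration
    (plays the role of b_i), b_max: upper bandwidth limit.
    Returns per-iteration (bandwidth, data, duration i_q) and the final
    stop-and-copy downtime.
    """
    data, bw, iters = m, b0, []
    for _ in range(max_iters):
        duration = data / bw                 # i_q for this iteration
        iters.append((bw, data, duration))
        data = r * duration                  # pages dirtied meanwhile
        if data <= stop_frac * m or bw >= b_max:
            break
        bw = min(bw + b_step, b_max)         # adaptive bandwidth increase
    downtime = data / min(bw + b_step, b_max)  # last iteration: VM paused
    return iters, downtime

iters, downtime = simulate_precopy(m=1024, r=40, b0=100, b_step=100, b_max=500)
print(len(iters), round(downtime, 3))
```

Each iteration shrinks the remaining dirty set as long as the bandwidth outruns the dirtying rate, which is exactly why a low bandwidth cap stretches total migration time.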
Only two migrations can be launched simultaneously (D is rejected)
Migration Scheduler (2/3)
Currently, bandwidth usage cannot be controlled during a migration; we can only control the maximum bandwidth usage level.
Deadlines: A: t1/t5, B: t1/t6, C: ignored/ignored, D: t2/t5
Migration Scheduler (3/3)
The migration scheduler should exercise control over migration bandwidth usage.
Three migrations can be launched simultaneously
Deadlines: A: t1/t5, B: t1/t6, C: ignored/ignored, D: t2/t5
Schedule plan: Offline scheduling plan (1/2)
Assumption 1: a fixed amount of bandwidth on each link is reserved for VM migrations; we allow for different amounts of reservation on different links.
Offline scheduling can be used for predictable VM workload clusters with periodicity, or for clusters with trend.
Objective: avoid the risk of overloading network links through migration-related bandwidth consumption.
Schedule plan: Offline scheduling plan (2/2)
Without Assumption 1:
Objective: minimize the migration-related risk of network congestion with respect to bandwidth demand fluctuations, since the available bandwidth is not known exactly in advance.
Solution: predict the average utilization of network links for all time slots (e.g., via the Network Weather Service [16]) and constantly adjust the bandwidth usable for migrations to the predicted utilization.
A more conservative available-bandwidth prediction is advisable.
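A minimal sketch of this adjustment, with a moving average standing in for a Network Weather Service-style forecast and a fixed safety margin as the conservative cut (the window size and margin are illustrative assumptions):

```python
def migratable_bandwidth(utilization_history, link_capacity, margin=0.1):
    """Bandwidth usable for migrations in the next time slot.

    Predict next-slot link utilization as a moving average over recent
    samples, then subtract a safety margin of the link capacity so the
    prediction errs on the conservative side.
    """
    window = utilization_history[-5:]               # last 5 time slots
    predicted = sum(window) / len(window)
    available = link_capacity - predicted
    return max(0.0, available - margin * link_capacity)  # conservative cut

# 10 Gb/s link, recent utilization samples in Gb/s:
print(migratable_bandwidth([6.0, 6.5, 7.0, 6.5, 7.0], link_capacity=10.0))
```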
[16] R. Wolski. Dynamically forecasting network performance using the network weather service. Cluster Computing, 1:119–132, 1998.
Schedule plan: Online scheduling plan (1/2)
Characteristics:
The migration sequence is not fixed in advance.
Migrations can be delayed as long as migration-finishing deadlines are met.
A migration might be rejected if it cannot be executed in time.
Schedule plan: Online scheduling plan (2/2)
Solution: emergency migrations may temporarily supersede the bandwidth allocations of lower-priority migrations, as shown in Figure 3.
The prioritization problem in network revenue management is similar to this issue.
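The preemption idea can be sketched as a strict priority allocation of link bandwidth (the policy and the numbers are illustrative, not the paper's model):

```python
def allocate_bandwidth(migrations, link_capacity):
    """Give emergency (high-priority) migrations bandwidth first;
    lower-priority migrations receive whatever remains.

    migrations: list of (name, priority, demand), lower number means
    higher priority. Returns {name: allocated bandwidth}.
    """
    alloc, remaining = {}, link_capacity
    for name, _, demand in sorted(migrations, key=lambda m: m[1]):
        alloc[name] = min(demand, remaining)
        remaining -= alloc[name]
    return alloc

# An emergency migration E temporarily supersedes ongoing low-priority A and B:
plan = allocate_bandwidth(
    [("A", 2, 0.5), ("B", 2, 0.5), ("E", 0, 0.8)], link_capacity=1.0)
print(plan)
```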
Figure 3
Summary
In this paper we propose:
Network-topology-aware scheduling models for VM live migrations, explicitly taking bandwidth requirements and the network topology into account
A scheme for classifying VM workloads
Future work: in cooperation with a commercial data center operator, we are currently implementing the proposed architecture.
Comment
It is a good point to consider bandwidth management for live migration.
But no arithmetic model for the migration schedule is given.
The prediction model is simple and not practical.
How to predict workloads is the direction I will research in depth.