Network-aware migration control and scheduling of differentiated virtual
machine workloads
Alexander Stage and Thomas Setzer, Technische Universität München (TUM)
Chair for Internet-based Information Systems
ICSE Workshop on Software Engineering Challenges in Cloud Computing, Vancouver,
Canada, May 2009
Introduction
Server-virtualization-based workload consolidation is increasingly used to:
Raise server utilization levels
Ensure cost-efficient data center operations
Unforeseen spikes or shifts in workloads require dynamic workload management to avoid server overload:
Continuously align placements of virtual machines (VMs) via VM live migration
How Does Live Migration Work?
Phase 1: Setup
Create a TCP connection between source and destination
Copy the VM's profile to the destination
Create a VM on the destination
[Diagram: configuration data (.BIN/.VSV/.XML) is copied from the source node (Host A) to the destination node (Host B); the .VHD resides on network storage]
How Does Live Migration Work?
Phase 2: Memory migration
Transfer memory to the destination
Track the pages dirtied while transferring memory
Pause the VM on the source node when starting the last transfer
[Diagram: memory content is transferred from the source node to the destination node; the .VHD stays on network storage]
How Does Live Migration Work?
Phase 3: State migration
Migrate the VM's register state from the source node
Start the VM on the destination node
Clean up the old VM on the source node
[Diagram: the running state moves from the source node to the destination node, where the VM's files (.BIN/.VSV/.XML) now reside; the .VHD stays on network storage]
Motivation
VM live migration enables:
Dynamic resource provisioning
Load balancing
But it imposes significant overheads that must be considered and controlled:
CPU overhead [17]
Network overhead, which depends on the network topology
[17] T. Wood, P. Shenoy, A. Venkataramani, and M. Yousif. Black-box and gray-box strategies for virtual machine migration. In 4th USENIX Symp. on Networked Systems Design and Impl., pages 229–242, 2007.
Network overhead of Live Migration (1/2)
In live migration phase 2, an iterative, bandwidth-adapting pre-copy memory page transfer algorithm is used.
Objectives:
Minimize VM downtime
Keep total migration time low
Lower the aggregated bandwidth consumption of a migration
The network overhead is not negligible [5]:
about 500 Mb/s for 10 seconds even for a trivial web server VM
[5]C. Clark, K. Fraser, S. H, J. G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield. Live migration of virtual machines. In Proc. of 2nd ACM/USENIX Symp. on Network Systems Design and Implementation, pages 273–286, 2005.
Network overhead of Live Migration (2/2)
Example: 20 VM migrations must be executed within 5 minutes.
Assume each migration consumes 1 Gb/s for 20 seconds.
Scheduling them sequentially over a single 10 (1) Gb/s link saturates the link completely for 40 (400) seconds.
Outcome: migrations impose sudden network load increases that can lead to resource shortages.
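The arithmetic behind this example can be checked directly (a sketch; like the slide, it assumes the migration traffic can be packed at the full link rate):

```python
# Slide's assumptions: 20 migrations, each consuming 1 Gb/s for 20 seconds.
migrations = 20
rate_gbps = 1.0    # bandwidth consumed by one migration (Gb/s)
duration_s = 20.0  # duration of one migration (s)

traffic_gb = migrations * rate_gbps * duration_s  # total traffic: 400 Gb

# Time for which a single link is fully saturated by this traffic:
for link_gbps in (10.0, 1.0):
    print(f"{link_gbps:g} Gb/s link: saturated for {traffic_gb / link_gbps:g} s")
```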
Migration scheduling architecture
To deal with the network overhead of live migration, we propose a migration scheduling architecture.
Data Center: collect performance parameters
Classify the workload type and predict host utilization
Determine expected resource bottlenecks and low utilization levels
Handle unexpected situations such as sudden surges in resource demand
Decide on an operational live migration plan that avoids migration-related SLA violations
Workload classifier (1/2)
We identify the following main workload attributes for our classification:
Predictability: workload behavior can be reliably forecasted for a given period of time, with tightly bounded forecasting errors.
Trend: the degree of upward- or downward-leading demand trends.
Periodicity: the length (time scale) and the power of recurring patterns.
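The slides leave the classifier's implementation out of scope; still, a minimal sketch shows how the predictability and trend attributes could be derived from a demand time series (the function name, thresholds, and half-window trend estimate are illustrative assumptions, not the paper's method):

```python
import statistics

def classify_workload(samples, var_threshold=0.15, trend_threshold=0.05):
    """Toy classifier over a demand time series (illustrative thresholds).

    Derives two of the attributes above: predictability (via relative
    variability) and trend; periodicity detection is omitted here.
    """
    mean = statistics.fmean(samples)
    rel_var = statistics.pstdev(samples) / mean if mean else 0.0
    # Trend: normalized difference between the second and first half.
    half = len(samples) // 2
    trend = (statistics.fmean(samples[half:]) -
             statistics.fmean(samples[:half])) / mean if mean else 0.0
    return {
        "predictable": rel_var < var_threshold,
        "trend": "up" if trend > trend_threshold
                 else "down" if trend < -trend_threshold else "flat",
    }

print(classify_workload([50, 52, 49, 51, 50, 53, 51, 52]))  # stable workload
print(classify_workload([10, 40, 15, 60, 20, 70, 25, 80]))  # volatile, rising
```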
Workload classifier (2/2)
For example:
1. Predictable, low-variability, low-trend workloads: can be co-hosted more aggressively by exploiting workload complementarities.
2. Highly non-predictable workloads: require certain buffer capacity on hosts to guarantee overload avoidance.
Note: the implementation of the workload classifier is not within the scope of this paper; it supervises a target for a period of time and then makes a class-assignment decision.
Allocation Planner (1/2)
For predictable workload classes
Intuition: by co-hosting VMs with complementary workloads, high resource utilization can be achieved.
Method: at runtime, use live migration to execute a VM re-allocation plan that optimizes the VM allocation.
Objective: decrease the number of required hosts; achieve high resource utilization without overload.
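As a sketch of this intuition (the paper's actual optimization model is not given on the slides), a first-fit placement that checks the element-wise sum of time-varying demand profiles against host capacity shows why complementary workloads allow aggressive co-hosting:

```python
def first_fit_consolidate(vm_profiles, capacity=1.0):
    """First-fit placement of VMs with time-varying demand profiles.

    A VM fits on a host iff the element-wise sum of the co-hosted
    profiles never exceeds capacity, so complementary (anti-correlated)
    workloads can share a host aggressively.
    """
    hosts = []  # each host holds the element-wise sum of its VMs' profiles
    for profile in vm_profiles:
        for host in hosts:
            if all(h + p <= capacity for h, p in zip(host, profile)):
                for t, p in enumerate(profile):
                    host[t] += p
                break
        else:
            hosts.append(list(profile))
    return hosts

# A day-peaking and a night-peaking VM are complementary and share a host;
# a third, flat VM needs a second host.
day, night, flat = [0.7, 0.2], [0.2, 0.7], [0.5, 0.5]
print(len(first_fit_consolidate([day, night, flat])))  # 2
```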
Allocation Planner (2/2)
For non-predictable workload classes
Method: set a rather conservative threshold on overall host utilization to avoid overload; if the threshold is exceeded, one or more VMs are selected as migration candidates.
Objective: avoiding overload is the first priority.
Migration Scheduler (1/3)
The bandwidth-adapting pre-copy memory page transfer algorithm:
1. All main memory pages are transferred.
2. Only the memory pages written to (dirtied) during the previous iteration are transferred; bandwidth usage is adaptively increased in each iteration.
3. If the set of dirtied memory pages is sufficiently small, or the upper bandwidth limit is reached, go to step 4; otherwise go to step 2.
4. The last pre-copy iteration is started (service downtime).
i_q = duration of the q-th iteration of VM i
b_i = constant bandwidth adaptation rate of VM i
m_i = memory size of VM i
r_i = constant memory dirtying rate of VM i
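Using these definitions, the iterative pre-copy process can be simulated (a sketch; the stop fraction, linear bandwidth adaptation, and all parameter values are assumptions, not the paper's numbers):

```python
def simulate_precopy(m, r, b0, b_step, b_max, stop_frac=0.01, max_iters=30):
    """Simulate iterative pre-copy memory transfer.

    m: memory size m_i (MB), r: dirtying rate r_i (MB/s),
    b0: initial bandwidth (MB/s), b_step: adaptive increase per iteration
    (plays the role of b_i), b_max: upper bandwidth limit.
    Returns per-iteration (bandwidth, data, duration i_q) and the final
    stop-and-copy downtime.
    """
    data, bw, iters = m, b0, []
    for _ in range(max_iters):
        duration = data / bw                 # i_q for this iteration
        iters.append((bw, data, duration))
        data = r * duration                  # pages dirtied meanwhile
        if data <= stop_frac * m or bw >= b_max:
            break
        bw = min(bw + b_step, b_max)         # adaptive bandwidth increase
    downtime = data / min(bw + b_step, b_max)  # last iteration: VM paused
    return iters, downtime

iters, downtime = simulate_precopy(m=1024, r=40, b0=100, b_step=100, b_max=500)
print(len(iters), round(downtime, 3))
```

Each iteration shrinks the remaining dirty set as long as the bandwidth outruns the dirtying rate, which is exactly why a low bandwidth cap stretches total migration time.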
Only two migrations can be launched simultaneously (D is rejected)
Migration Scheduler (2/3)
Currently, bandwidth usage cannot be controlled during a migration; we can only control the maximum bandwidth usage level.
Deadlines: A: t1/t5, B: t1/t6, C: ignored/ignored, D: t2/t5
Migration Scheduler (3/3)
The migration scheduler should exercise control over migration bandwidth usage.
Three migrations can be launched simultaneously
Deadlines: A: t1/t5, B: t1/t6, C: ignored/ignored, D: t2/t5
Schedule plan: Offline scheduling plan (1/2)
Assumption 1: a fixed amount of bandwidth on each link is reserved for VM migrations; we allow for different amounts of reservation on different links.
Offline scheduling can be used for predictable VM workload clusters with periodicity, or for clusters with trend.
Objective: avoid the risk of overloading network links through migration-related bandwidth consumption.
Schedule plan: Offline scheduling plan (2/2)
Without Assumption 1:
Objective: minimize the migration-related risk of network congestion with respect to bandwidth demand fluctuations, since the available bandwidth is not known exactly in advance.
Solution: predict the average utilization of network links for all time slots (e.g., via the Network Weather Service [16]) and constantly adjust the bandwidth usable for migrations to the predicted utilization.
A more conservative available-bandwidth prediction is advisable.
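A minimal sketch of this adjustment, with a moving average standing in for a Network Weather Service-style forecast and a fixed safety margin as the conservative cut (the window size and margin are illustrative assumptions):

```python
def migratable_bandwidth(utilization_history, link_capacity, margin=0.1):
    """Bandwidth usable for migrations in the next time slot.

    Predict next-slot link utilization as a moving average over recent
    samples, then subtract a safety margin of the link capacity so the
    prediction errs on the conservative side.
    """
    window = utilization_history[-5:]               # last 5 time slots
    predicted = sum(window) / len(window)
    available = link_capacity - predicted
    return max(0.0, available - margin * link_capacity)  # conservative cut

# 10 Gb/s link, recent utilization samples in Gb/s:
print(migratable_bandwidth([6.0, 6.5, 7.0, 6.5, 7.0], link_capacity=10.0))
```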
[16] R. Wolski. Dynamically forecasting network performance using the network weather service. Cluster Computing, 1:119–132, 1998.
Schedule plan: Online scheduling plan (1/2)
Characteristics:
The migration sequence is not fixed in advance.
Migrations can be delayed as long as migration-finishing deadlines are met.
A migration might be rejected if it cannot be executed in time.
Schedule plan: Online scheduling plan (2/2)
Solution: emergency migrations may temporarily supersede the bandwidth allocations of lower-priority migrations, as shown in Figure 3.
The prioritization problem in network revenue management is similar to this issue.
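The preemption idea can be sketched as a strict priority allocation of link bandwidth (the policy and the numbers are illustrative, not the paper's model):

```python
def allocate_bandwidth(migrations, link_capacity):
    """Give emergency (high-priority) migrations bandwidth first;
    lower-priority migrations receive whatever remains.

    migrations: list of (name, priority, demand), lower number means
    higher priority. Returns {name: allocated bandwidth}.
    """
    alloc, remaining = {}, link_capacity
    for name, _, demand in sorted(migrations, key=lambda m: m[1]):
        alloc[name] = min(demand, remaining)
        remaining -= alloc[name]
    return alloc

# An emergency migration E temporarily supersedes ongoing low-priority A and B:
plan = allocate_bandwidth(
    [("A", 2, 0.5), ("B", 2, 0.5), ("E", 0, 0.8)], link_capacity=1.0)
print(plan)
```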
Figure 3
Summary
In this paper we propose:
Network-topology-aware scheduling models for VM live migrations, explicitly taking bandwidth requirements and the network topology into account
A scheme for classifying VM workloads
Future work: in cooperation with a commercial data center operator, we are currently implementing the proposed architecture.
Comment
It is a good point to consider bandwidth management for live migration.
But no arithmetic model for the migration schedule is given.
The prediction model is simple and not practical.
How to predict workloads is the direction I will research in depth.