Live Migration of Virtual Machines
Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen†, Eric Jul†, Christian Limpach, Ian Pratt, Andrew Warfield
University of Cambridge Computer Laboratory
† Department of Computer Science, University of Copenhagen, Denmark
USENIX NSDI '05


Page 1: Live Migration  of  Virtual Machines

Live Migration of Virtual Machines

Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen†, Eric Jul†, Christian Limpach, Ian Pratt, Andrew Warfield

University of Cambridge Computer Laboratory
† Department of Computer Science, University of Copenhagen, Denmark

USENIX NSDI '05

Page 2: Live Migration  of  Virtual Machines

Introduction

  • Operating system virtualization has attracted considerable interest in recent years, particularly in the data center and cluster computing communities.
  • It allows many OS instances to run concurrently on a single physical machine.
  • Migrating an entire OS and all of its applications as one unit
    ◦ Unlike process-level migration, this avoids the problem of residual dependencies.

Page 3: Live Migration  of  Virtual Machines

Introduction

  • Live migration
    ◦ Moves a running VM between hosts without interrupting its network connections.
    ◦ Allows a separation of concerns between the users and the operator of a data center or cluster.
    ◦ Separates hardware considerations from software considerations.

Page 4: Live Migration  of  Virtual Machines

Introduction

  • Downtime
    ◦ The period during which services are entirely unavailable
  • Total migration time
    ◦ The period during which state on both machines is synchronized, which may therefore affect reliability
  • This paper uses the "pre-copy" approach to achieve live migration and aims to reduce downtime (implemented on Xen)

Page 5: Live Migration  of  Virtual Machines

Design

  • Network
    ◦ Generate an unsolicited ARP reply from the migrated host, advertising that the IP address has moved to a new location.
  • Storage
    ◦ Use a network-attached storage (NAS) device
    ◦ Disk storage does not need to be migrated
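The network step above can be made concrete by building the frame such an announcement would carry. The following is a minimal sketch, not the Xen implementation: the function name `unsolicited_arp_reply` and the example MAC and IP values are hypothetical, and the frame is only constructed, not sent (sending would need a raw socket and elevated privileges).

```python
import socket
import struct


def unsolicited_arp_reply(mac: bytes, ip: str) -> bytes:
    """Build a broadcast Ethernet frame carrying an unsolicited ARP reply."""
    broadcast = b"\xff" * 6
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    eth_header = broadcast + mac + struct.pack("!H", 0x0806)
    arp_body = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                    # hardware type: Ethernet
        0x0800,               # protocol type: IPv4
        6, 4,                 # MAC and IPv4 address lengths
        2,                    # opcode 2: reply
        mac, ip_bytes,        # sender: the migrated VM's MAC and IP
        broadcast, ip_bytes,  # target: broadcast MAC, same IP
    )
    return eth_header + arp_body


# Hypothetical addresses, for illustration only.
frame = unsolicited_arp_reply(bytes.fromhex("020000000001"), "10.0.0.5")
```

Peers that receive the frame update their ARP caches, so later packets for the IP address go to the new physical host.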

Page 6: Live Migration  of  Virtual Machines

Design

  • Memory transfer
    ◦ Push phase
    ◦ Stop-and-copy phase
    ◦ Pull phase
  • Most practical solutions select one or two of the three phases
    ◦ e.g. pure stop-and-copy, pure demand-migration
  • This paper uses an iterative push phase followed by a typically very short stop-and-copy phase.
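The iterative push followed by a short stop-and-copy can be sketched as a small simulation. This is an illustrative model only, not the paper's code: `precopy_migrate`, `dirty_per_round`, `stop_threshold`, and `max_rounds` are hypothetical names, and the dirtying behaviour is supplied by the caller.

```python
def precopy_migrate(total_pages, dirty_per_round, stop_threshold, max_rounds=30):
    """Simulate iterative pre-copy.

    Returns (rounds, pages left for stop-and-copy, pages pushed while live).
    dirty_per_round(round_no, pages_sent) models how many pages the running
    VM dirties while that round is in flight (an assumption of this sketch).
    """
    to_send = total_pages  # the first round pushes every page
    sent_live = 0
    for round_no in range(1, max_rounds + 1):
        sent_live += to_send  # push phase: the VM keeps running
        # Pages dirtied during this round must be sent again.
        to_send = min(dirty_per_round(round_no, to_send), total_pages)
        if to_send <= stop_threshold:
            break
    # Stop-and-copy: the VM is paused and the remaining pages are sent.
    return round_no, to_send, sent_live


# Example: each round dirties a tenth of the pages it just sent.
result = precopy_migrate(1000, lambda r, sent: sent // 10, stop_threshold=5)
```

With a dirtying rate well below the transfer rate, the set of pages to resend shrinks each round, which is what keeps the final paused phase short.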

Page 7: Live Migration  of  Virtual Machines

Related Work

  • Shut down the VM
  • Pre-copy
  • VMware

Page 8: Live Migration  of  Virtual Machines

Related Work

  • Post-Copy Live Migration of Virtual Machines
    ◦ Michael R. Hines, Umesh Deshpande, and Kartik Gopalan
    ◦ Computer Science, Binghamton University (SUNY)
    ◦ ACM SIGPLAN/SIGOPS VEE '09

Page 9: Live Migration  of  Virtual Machines

Design Overview

Page 10: Live Migration  of  Virtual Machines

Writable Working Sets

  • Some pages are seldom or never modified and hence are good candidates for pre-copy
  • Some pages are written often and so are best transferred via stop-and-copy
  • This frequently written set of pages is the writable working set (WWS)

Page 11: Live Migration  of  Virtual Machines

Writable Working Sets

Page 12: Live Migration  of  Virtual Machines

Writable Working Sets

Page 13: Live Migration  of  Virtual Machines

Dynamic Rate-Limiting

  • Dynamically adapt the bandwidth limit during each pre-copying round
  • The administrator selects a minimum (m) and a maximum (M) bandwidth limit
  • The first pre-copy round transfers pages at the minimum bandwidth m

Page 14: Live Migration  of  Virtual Machines

Dynamic Rate-Limiting

  • Dirtying rate = (number of pages dirtied in the previous round) / (duration of the previous round)
  • Bandwidth limit for the next round = dirtying rate + 50 Mbit/sec
  • Stop pre-copying when
    ◦ the calculated rate exceeds M, or
    ◦ less than 256KB remains to be transferred
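The rate rule and the two stop conditions can be written out as a short sketch. This is an illustration under stated assumptions, not the Xen code: the 4 KiB page size, the function names, and the choice to treat the pages dirtied in the previous round as the data that "remains" are all assumptions of this example.

```python
INCREMENT_MBIT = 50.0    # constant added to the dirtying rate (from the slide)
PAGE_BITS = 4096 * 8     # assumption: 4 KiB pages
STOP_BYTES = 256 * 1024  # stop when less than 256KB remains (from the slide)


def dirtying_rate_mbit(pages_dirtied, round_seconds):
    """Dirtying rate of the previous round, in Mbit/sec."""
    return pages_dirtied * PAGE_BITS / round_seconds / 1e6


def next_round(pages_dirtied, round_seconds, max_mbit):
    """Return (bandwidth limit for the next round in Mbit/sec, stop flag)."""
    rate = dirtying_rate_mbit(pages_dirtied, round_seconds) + INCREMENT_MBIT
    remaining_bytes = pages_dirtied * PAGE_BITS // 8
    stop = rate > max_mbit or remaining_bytes < STOP_BYTES
    # Never exceed the administrator's maximum M.
    return min(rate, max_mbit), stop


# 10,000 pages dirtied over a 10-second round, with M = 500 Mbit/sec.
rate, stop = next_round(10_000, 10.0, max_mbit=500.0)
```

In this example the dirtying rate is about 32.8 Mbit/sec, so the next round would be limited to roughly 82.8 Mbit/sec and pre-copying would continue.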

Page 15: Live Migration  of  Virtual Machines

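Some implementation issues

  • Rapid page dirtying
    ◦ Hot pages that are dirtied repeatedly need not be transferred in every round
  • Freeing page cache pages
    ◦ Unused page-cache pages can be freed so they need not be copied in the first round
  • Stunning rogue processes
    ◦ Limit each process to 40 write faults before it is moved to a wait queue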
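The per-process write-fault limit can be illustrated with a small counter. This is a hypothetical user-space model, not the in-kernel mechanism: the class `WriteFaultThrottle`, its methods, and the idea of resetting budgets per interval are assumptions made for this sketch.

```python
from collections import Counter

FAULT_LIMIT = 40  # per-process write-fault budget (from the slide)


class WriteFaultThrottle:
    """Track write faults per process and flag processes that exceed the budget."""

    def __init__(self, limit=FAULT_LIMIT):
        self.limit = limit
        self.faults = Counter()
        self.stunned = set()

    def on_write_fault(self, pid):
        """Record one write fault; return True if pid should wait."""
        self.faults[pid] += 1
        if self.faults[pid] >= self.limit:
            self.stunned.add(pid)
        return pid in self.stunned

    def new_interval(self):
        """Reset all budgets and release stunned processes (assumed policy)."""
        self.faults.clear()
        self.stunned.clear()


throttle = WriteFaultThrottle()
verdicts = [throttle.on_write_fault(pid=7) for _ in range(40)]
```

Only the process that keeps faulting is slowed down, so well-behaved processes in the same VM continue at full speed while the migration catches up.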

Page 16: Live Migration  of  Virtual Machines

Stunning Rogue Processes

Page 17: Live Migration  of  Virtual Machines

Evaluation

  • Dell PE-2650 server-class machines
  • Dual Xeon 2GHz CPUs
  • 2GB memory
  • Connected via Gigabit Ethernet
  • Storage: NAS accessed over the iSCSI protocol
  • XenLinux 2.4.27

Page 18: Live Migration  of  Virtual Machines

a. Simple Web Server

  • Apache 1.3 web server
  • Continuously serving a single 512KB file
  • Memory allocation: 800MB
  • Initially rate limited to 100Mbit/sec
  • 776MB of memory to be transferred in the first round
  • 165ms outage
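As a rough check on these figures, the duration of the first round follows from the amount of memory and the initial rate limit. The sketch below assumes decimal megabytes, a constant 100Mbit/sec for the whole round, and no protocol overhead; the function name `transfer_seconds` is hypothetical.

```python
def transfer_seconds(megabytes, mbit_per_sec):
    """Seconds needed to send `megabytes` (10**6 bytes each) at `mbit_per_sec`."""
    return megabytes * 8 / mbit_per_sec


# 776MB at the initial 100Mbit/sec limit: about 62 seconds.
first_round = transfer_seconds(776, 100)
```

Under those assumptions the first round takes about a minute, which is why nearly all of the migration happens while the server is still running and only the final 165ms is visible as an outage.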

Page 19: Live Migration  of  Virtual Machines

a. Simple Web Server

Page 20: Live Migration  of  Virtual Machines

b. Complex Web Workload: SPECweb99

  • Memory allocation: 800MB
  • 30% of requests require dynamic content generation
  • 16% are HTTP POST operations
  • 0.5% execute a CGI script
  • The server generates access and POST logs
  • 210ms outage

Page 21: Live Migration  of  Virtual Machines

b. Complex Web Workload: SPECweb99

Page 22: Live Migration  of  Virtual Machines

c. Low-Latency Server: Quake 3

  • A multiplayer online game server
  • Runs in a virtual machine with 64MB of memory
  • Six players joined the game and started to play within a shared arena
  • Only 148KB of data is transferred in the last round
  • Downtime: 60ms

Page 23: Live Migration  of  Virtual Machines

c. Low-Latency Server: Quake 3

Page 24: Live Migration  of  Virtual Machines

d. A Diabolical Workload: MMuncher

  • A virtual machine writes to memory faster than it can be transferred
  • Memory: 512MB
  • A simple C program writes constantly to a 256MB region of memory
  • Downtime: 3.5 seconds

Page 25: Live Migration  of  Virtual Machines

d. A Diabolical Workload: MMuncher

Page 26: Live Migration  of  Virtual Machines

Conclusion

  • A pre-copy live migration method on Xen
  • Takes the writable working set (WWS) into account
  • Dynamic network-bandwidth adaptation
  • Realistic server workloads such as SPECweb99 can be migrated with just 210ms downtime
  • A Quake 3 game server is migrated with an imperceptible 60ms outage