*Real Time Issues in Live Migration of Virtual Machines

Presented by: Michael Cherkassky, Supervised by: Prof. Danny Raz




*Back to Basics

*Virtual Machine

*Migration

*Live Migration

*Real-Time

“Real Time Issues in Live migration of Virtual Machines”


*Migration

[Figure: PH-1 and PH-2]


*Migration Method-Naïve

[Figure: PH-2 and PH-1]

What was transmitted?


*Migration Method-Wise

[Figure: PH-1 and PH-2]


*Live Migration Model

*Goals:

*Identify requirements of live migration process

*Motivation for finding a method for ordering the page transfer


*Assumptions

*The VM uses a set of memory pages, each of size P

*Available bandwidth for the transfer: b (bps)

*Time needed to transfer a single page: depends on P, b, and a per-page overhead H

*Each page is accessed for “write” with a given per-page probability during each time frame T
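As a rough numerical reading of these assumptions, the sketch below computes the time to send one page and the expected number of pages written during one time frame. The formula (P + H) / b, with P and H expressed in bits, is an assumption, because the slide's own expression did not survive extraction. The function names are hypothetical.

```python
def page_transfer_time(page_bits, overhead_bits, bandwidth_bps):
    """Seconds to send one page, ASSUMING time = (P + H) / b with P, H in bits."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    return (page_bits + overhead_bits) / bandwidth_bps


def expected_writes_per_frame(write_probs):
    """Expected number of pages written in one time frame.

    By linearity of expectation this is the sum of the per-page
    write probabilities, whatever the dependence between pages.
    """
    return sum(write_probs)


# Illustrative values only (not taken from the slides).
print(page_transfer_time(page_bits=4096 * 8, overhead_bits=512, bandwidth_bps=100e6))
print(expected_writes_per_frame([0.5, 0.25, 0.25]))
```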


*Algorithm

1. At the start of migration, the set of pages to be transmitted is initialized to the entire page set used by the VM.

2. For each live round: all the pages in the current set are transferred in the order specified for that round. When the transfer ends, the pages found dirty again form the set for the next round.

3. Stop the VM and transfer the last dirty pages, up to the migration finishing time, using the bandwidth (in bps) available while the VM is stopped.
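The three steps can be sketched as a small simulation. This is a toy model under stated assumptions, not the authors' implementation: each transmitted page takes one time frame, every page is written independently with its own probability in each frame, a page counts as dirty again if it is written during or after its own transfer frame, and pages are sent in index order. The name `simulate_precopy` is hypothetical.

```python
import random


def simulate_precopy(write_probs, live_rounds, seed=0):
    """Return the number of pages sent in each live round, followed by
    the number left for the final stop-and-copy round."""
    rng = random.Random(seed)
    to_send = list(range(len(write_probs)))  # step 1: every page
    sizes = []
    for _ in range(live_rounds):             # step 2: live rounds
        sizes.append(len(to_send))
        pending = set(to_send)                # not yet sent in this round
        dirty_next = set()
        for page in to_send:                  # one time frame per page
            pending.discard(page)
            for p, prob in enumerate(write_probs):
                # A write to a page still pending is harmless: the page
                # will be sent later in this same round anyway.
                if p not in pending and rng.random() < prob:
                    dirty_next.add(p)
        to_send = sorted(dirty_next)
    sizes.append(len(to_send))                # step 3: sent with the VM stopped
    return sizes


print(simulate_precopy([0.01] * 50 + [0.5] * 5, live_rounds=3))
```

The last element of the returned list is what determines downtime in this model; the earlier elements determine how long the live phase lasts.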


*Crucial Values

*Down Time

*Overall Migration time:
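The formulas for these two values did not survive extraction. A minimal sketch of how they could be computed is given below, assuming that each live round lasts (pages sent) × (per-page time at the live bandwidth) and that downtime is (pages in the final round) × (per-page time while the VM is stopped). The function and parameter names are hypothetical.

```python
def migration_times(round_sizes, live_page_time, stopped_page_time):
    """Return (downtime, overall_migration_time) in seconds.

    round_sizes: pages sent in each live round, with the final
    stop-and-copy round as the last element.
    """
    if not round_sizes:
        raise ValueError("need at least the final round")
    *live_rounds, final_round = round_sizes
    downtime = final_round * stopped_page_time
    overall = sum(live_rounds) * live_page_time + downtime
    return downtime, overall


# Illustrative values only: 100 and 20 pages sent live, 5 pages sent stopped.
print(migration_times([100, 20, 5], live_page_time=0.5, stopped_page_time=0.25))
```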


*Propositions

*The probability that a page that is not dirty at the start of migration becomes dirty, and thus needs to be transmitted in the final migration round, is:

*The probability that a page that is dirty at the start of migration becomes dirty again, and thus needs to be transmitted in the final migration round, is:

(The second expression uses the inverse of the ordering function, which maps a page to its position in the transmission order.)
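The two expressions are missing from the extracted slide. One plausible reading, consistent with independent per-frame writes, is sketched below. It is an assumption, not the slide's formula: a clean page is exposed for every frame of the round, while a page sent at 0-based slot `slot` is exposed only from its own transfer frame onward. The slot is obtained by inverting the transmission order.

```python
def prob_dirty_if_clean(write_prob, frames_in_round):
    """Page was clean at round start: exposed for the whole round."""
    return 1 - (1 - write_prob) ** frames_in_round


def prob_dirty_again(write_prob, frames_in_round, slot):
    """Page is sent at 0-based `slot`: exposed for frames slot..end (ASSUMED)."""
    return 1 - (1 - write_prob) ** (frames_in_round - slot)


# Inverting the order: order[slot] -> page becomes slot_of[page] -> slot.
order = [7, 3, 9]                      # hypothetical page ids, in send order
slot_of = {page: slot for slot, page in enumerate(order)}

print(prob_dirty_if_clean(0.5, len(order)))
print(prob_dirty_again(0.5, len(order), slot_of[9]))
```

Under this reading, a page sent later in the round has less time to be written again, which is why the transmission order matters.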


*Overall Migration Time


*Order of Transmission

*The order of transmission of the pages that minimizes the expected number of dirty pages found at the end of the live migration step must satisfy an ordering condition.

*Conclusion: If the probabilities are lower than a certain bound, the optimum ordering is obtained for increasing values of the probabilities. On the other hand, if the probabilities are greater than a certain bound, the optimum ordering is obtained for decreasing values of the probabilities.
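The two regimes in the conclusion can be observed in a toy model by brute force. The sketch below assumes that a page sent at 0-based slot j out of n stays exposed to writes for n - j frames, so its chance of being dirty again is 1 - (1 - p)^(n - j). This exposure rule is an assumption, and the bounds from the slide are not reproduced here.

```python
from itertools import permutations


def expected_dirty(probs_in_send_order):
    """Expected number of re-dirtied pages for a given send order (toy model)."""
    n = len(probs_in_send_order)
    return sum(1 - (1 - p) ** (n - slot)
               for slot, p in enumerate(probs_in_send_order))


def best_order(probs):
    """Exhaustively pick the send order with the fewest expected dirty pages."""
    return min(permutations(probs), key=expected_dirty)


print(best_order([0.02, 0.01, 0.03]))  # small write probabilities
print(best_order([0.8, 0.9, 0.7]))     # large write probabilities
```

In this model the small-probability set is best sent in increasing order of probability and the large-probability set in decreasing order, matching the direction of the slide's conclusion. Exhaustive search is only practical for a handful of pages.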


*Is it Right?


*Reducing migration time

All pages are equal, but some are more equal

Problem: wasteful to transmit at each step

Solution: Wait until the end, when VM is down

Algorithm: among the pages found dirty at the start of step k, delay the transmission of a chosen set of pages until the VM is stopped. Which pages? Those selected by comparison against a threshold value.

Conclusion: It is possible to achieve a negligible increase in down-time with a substantial decrease of overall migration time.
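A minimal sketch of the delayed-transmission idea follows. The slide's exact selection criterion was lost, so the rule used here (defer a page when its write probability is at or above the threshold) is an assumption, and `split_by_threshold` is a hypothetical name.

```python
def split_by_threshold(dirty_pages, write_probs, threshold):
    """Split the dirty set into pages to send now and pages to defer
    until the VM is stopped (ASSUMED rule: defer if prob >= threshold)."""
    send_now = [p for p in dirty_pages if write_probs[p] < threshold]
    deferred = [p for p in dirty_pages if write_probs[p] >= threshold]
    return send_now, deferred


# Illustrative values only: page id -> estimated write probability.
probs = {0: 0.1, 1: 0.9, 2: 0.5}
print(split_by_threshold([0, 1, 2], probs, threshold=0.5))
```

Deferring a page saves its retransmission in every live round and costs one extra page in the final round, which is the trade-off the conclusion describes.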


*Down to Earth

Problem: Need to know, precisely, the write probability for each page.

Solution: Gathering information during run-time.

Problem: Non-negligible overheads.

Solution: LRU

Frequency-based approach
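Both heuristics replace the unknown write probabilities with cheap run-time statistics. The sketch below is one way to express them, assuming that pages written least recently (or least often) are sent first, so that the hottest pages go last. Whether the authors order pages exactly this way is not stated on the slide, and the function names are hypothetical.

```python
def lru_order(pages, last_write_time):
    """Send least-recently-written pages first; never-written pages lead."""
    return sorted(pages, key=lambda p: last_write_time.get(p, float("-inf")))


def frequency_order(pages, write_counts):
    """Send least-frequently-written pages first."""
    return sorted(pages, key=lambda p: write_counts.get(p, 0))


# Illustrative bookkeeping: page id -> last write timestamp / write count.
print(lru_order([1, 2, 3], {1: 10, 2: 5}))
print(frequency_order([1, 2, 3], {1: 7, 3: 2}))
```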


*Resources

*Computational Resources:

*Scheduling Guarantee by the Kernel: (Q, P)

*Network Resources:

*b needs to be constant

*Possible to reserve


*Does it work?

*The authors have modified the KVM hypervisor.

*Page Tracing mechanism

*Page accesses are traced within the hypervisor, using a bitmap

*The implementation exploits this information to modify the transfer order
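To illustrate what bitmap-based page tracing means, here is a small user-space sketch with one bit per page. It is not KVM's interface. The class `DirtyBitmap` and its methods are hypothetical and only show the data structure.

```python
class DirtyBitmap:
    """One bit per page: set on write, read and cleared once per round."""

    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.bits = bytearray((num_pages + 7) // 8)

    def mark(self, page):
        """Record a write to `page`."""
        if not 0 <= page < self.num_pages:
            raise IndexError("page out of range")
        self.bits[page // 8] |= 1 << (page % 8)

    def is_dirty(self, page):
        return bool(self.bits[page // 8] & (1 << (page % 8)))

    def drain(self):
        """Return the dirty pages in index order and clear the bitmap."""
        dirty = [p for p in range(self.num_pages) if self.is_dirty(p)]
        self.bits = bytearray(len(self.bits))
        return dirty


bitmap = DirtyBitmap(10)
bitmap.mark(3)
bitmap.mark(9)
print(bitmap.drain())
```

A migration loop would call `drain()` at the end of each round to obtain the next set of pages to send, then reorder that set with whichever heuristic is in use.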


*Does it work?

*Simulations - Virtualized VideoLAN Client (VLC)

*6500 mapped pages (16 KB/page)

*Transfer rate of 100 Mbit/s

*8 sec. (!)
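The 8-second figure is close to the time needed to copy every mapped page once at the stated rate. The check below assumes 1 KB = 1024 bytes and ignores per-page overhead. Reading the "8 sec." as one full pass over memory is an interpretation, not something the slide states.

```python
def full_copy_seconds(num_pages, page_bytes, rate_bits_per_s):
    """Time to send every page once, ignoring protocol overhead."""
    return num_pages * page_bytes * 8 / rate_bits_per_s


# Figures from the slide: 6500 pages of 16 KB at 100 Mbit/s.
print(round(full_copy_seconds(6500, 16 * 1024, 100e6), 2))
```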


Guaranteed bandwidth of 50 Mbit/s

*Standard vs. LRU

*570 -> 300 (47%) (K=1)

*360 -> 290 (19.4%) (K=3)

*4800 -> 4500 (6.25%) (K=1)

*5500 -> 5000 (9.1%) (K=3)


LRU with delayed transmission

*LRU vs. Improved LRU

*300 -> 220 (27%) (K=3)

*5000 -> 4400 (12%) (K=3)
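The percentages on this slide and the previous one are relative reductions, (before - after) / before. The short check below recomputes them from the before/after pairs. The slides do not state the units, so none are assumed here.

```python
def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Before/after pairs as they appear on the two result slides.
pairs = [(570, 300), (360, 290), (4800, 4500), (5500, 5000),
         (300, 220), (5000, 4400)]
for before, after in pairs:
    print(f"{before} -> {after}: {reduction_percent(before, after):.2f}%")
```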


*Conclusions

*It’s possible to reduce downtime and improve QoS with simple page ordering algorithms

*With a guaranteed bandwidth, LRU proved to be an effective aid for page ordering, achieving good results.

*Further work needs to be done…