Software & Services Group
COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service
Eddie Dong, Yunhong Jiang
1
Software & Services Group
Legal Disclaimer
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
Intel may make changes to specifications and product descriptions at any time, without notice.
All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2012 Intel Corporation.
Software & Services Group
Agenda
• Background
• COarse-grain LOck-stepping
• Performance Optimization
• Evaluation
• Summary
3
Software & Services Group
Non-Stop Service with VM Replication
• Typical Non-stop Service Requires
– Expensive hardware for redundancy
– Extensive software customization
• VM Replication: Cheap Application-agnostic Solution
4
Software & Services Group
Existing VM Replication Approaches
• Replication Per Instruction: Lock-stepping
– Execute in parallel for deterministic instructions
– Lock and step for un-deterministic instructions
• Replication Per Epoch: Continuous Checkpoint
– Secondary VM is synchronized with Primary VM per epoch
– Output is buffered within an epoch
5
Software & Services Group
Problems
• Lock-stepping
– Excessive replication overhead
• memory access in an MP-guest is un-deterministic
• Continuous Checkpoint
– Extra network latency
– Excessive VM checkpoint overhead
6
Software & Services Group
Agenda
• Background
• COarse-grain LOck-stepping
• Performance Optimization
• Evaluation
• Summary
7
Software & Services Group
Why COarse-grain LOck-stepping (COLO)
• VM Replication is an overly strong condition
– Why we care about the VM state ?
• The client care about response only
– Can the control failover without ”precise VM state
replication”?
• Coarse-grain lock-stepping VMs
– Secondary VM is a replica, as if it can generate same response with primary so far
• Be able to failover without service stop
8
Non-stop service focus on server response, not internal machine state!
Software & Services Group
How COLO Works
• Response Model for C/S System
– & are the request and the execution result of an un-deterministic instruction
– Each response packet from the equation is a semantics response
• Successfully failover at kth packet if
(C is the packet series the client received)
9
Software & Services Group
Architecture of COLO
10
COarse-grain LOck-stepping Virtual Machine for Non-stop Service
Software & Services Group
Why Better
• Comparing with Continuous VM checkpoint
– No buffering-introduced latency
– Less checkpoint frequency
• On demand vs. periodic
• Comparing with lock-stepping
– Eliminate excessive overhead of un-deterministic instruction execution due to MP-guest memory access
11
Software & Services Group
Agenda
• Background
• COarse-grain LOck-stepping
• Performance Optimization
• Evaluation
• Summary
12
Software & Services Group
Performance Challenges
• Frequency of Checkpoint
– Highly dependent on the Output Similarity, or Response Similarity
• Key Focus is TCP packet!
• Cost of Checkpoint
– Xen/Remus uses passive-checkpoint
• Secondary VM is not resumed until failover Slow path
– COLO implements active-checkpoint
• Secondary VM resumes frequently
13
Software & Services Group
Improving Response Similarity
• Minor Modification to Guest TCP/IP Stack
– Coarse Grain Time Stamp
– Highly-deterministic ACK mechanism
– Coarse Grain Notification Window Size
– Per-Connection Comparison
14
Software & Services Group
Similarities after Optimization
• Web Server • FTP Server
15
0
100
200
300
400
500
600
0
4000
8000
12000
16000
1 2 4 8 16 32
Tim
e (
ms)
Nu
mb
er
of
Pac
kets
Number of Threads
Packets # Duration
0
100
200
300
400
0
1000
2000
3000
4000
PUT GET
Tim
e (
ms)
Pac
kets
#
Packets # Duration
*Run Web Bench in Client
For more complete information about performance and benchmark results, visit Performance Test Disclosure
Software & Services Group
Reducing the Cost of Active-checkpoint
• Lazy Device State Update
– Lazy network interface up/down
– Lazy event channel up/down
• Fast Path Communication
16
Software & Services Group
Checkpoint Cost with Optimizations
17
Final cost: 74ms/checkpoint: (1/3 on page transmission, 2/3 on suspend/resume)
0
300
600
900
1200
1500
1800
Baseline Lazy Netif Lazy Event Channel Efficient Comm.
Sp
en
t Ti
me
(m
s)
Suspend Resume Netif-Up Netif-Down Mem Xmit
Leave Net UP
Leave EventChannel up
Replace XenStore Access with Eventchannel
For more complete information about performance and benchmark results, visit Performance Test Disclosure
Software & Services Group
Agenda
• Background
• COarse-grain LOck-stepping
• Performance Optimization
• Evaluation
• Summary
18
Software & Services Group
Configurations
• Hardware
– Intel® Core™ i7 platform, a 2.8 GHz quad-core processor
– 2GB RAM
– Intel® 82576 1Gbps NIC * 2 (internal & external)
• Software
– Xen 4.1
– Domain 0: RHEL5U5
– Guest: 32-bit BusyBox 1.20.0, Linux kernel 2.6.32
• 256MB RAM and uses a ramdisk for storage
19
Software & Services Group
Bandwidth of NetPerf
20
TCP UDP
0
200
400
600
800
1000
54 1500 64K
Ban
dw
idth
(M
b/s
)
Message Size
Native Remus-20ms Remus-40ms COLO
0
200
400
600
800
1000
54 1500 64K
Ban
dw
idth
(M
b/s
)
Message Size
Native Remus-20ms Remus-40ms COLO
For more complete information about performance and benchmark results, visit Performance Test Disclosure
Software & Services Group
FTP Server
21
0.5
1
2
4
8
16
32
64
128
PUT GET
Tim
e (s
)
Remus-20ms
Remus-40ms
COLO
Native
For more complete information about performance and benchmark results, visit Performance Test Disclosure
Software & Services Group
Web Server - Concurrency
22
Run Web Bench in Client
1
10
100
1000
1 2 4 8 16 32
Thro
ugh
pu
t (M
bp
s)
Threads
Native Remus-20ms Remus-40ms COLO
For more complete information about performance and benchmark results, visit Performance Test Disclosure
Software & Services Group
Web Server - Throughput
23
Run httperf in Client
0
200
400
600
800
1000
1200
100 500 1000 1500
Thro
ugh
pu
t (r
esp
on
se/s
)
Request / second
Native Remus-20ms Remus-40ms COLO
For more complete information about performance and benchmark results, visit Performance Test Disclosure
Software & Services Group
Latency in Netperf/Ping
24
0.28 0.28 0.38 0.40 0.4 0.55
0
10
20
30
40
Netperf-TCP-RR Netperf-UDP-RR Ping
Ave
rage
Lat
en
cy (
ms)
Native
Remus-20ms
Remus-40ms
COLO
For more complete information about performance and benchmark results, visit Performance Test Disclosure
Software & Services Group
Web Server - Latency
25
Run httperf in Client
0
200
400
600
800
1000
1200
100 500 1000 1500
Re
spo
nse
Lat
en
cy (
ms)
Request/second
Native Remus-20ms Remus-40ms COLO
For more complete information about performance and benchmark results, visit Performance Test Disclosure
Software & Services Group
Agenda
• Background
• COarse-grain LOck-stepping
• Performance Optimization
• Evaluation
• Summary
26
Software & Services Group
Summary
• COLO is an ideal Application-agnostic Solution for Non-stop service
– Web server: 67% of native performance
– CPU, memory and netperf: near-native performance
• Next steps:
– Merge into Xen
– More optimizations
27
Software & Services Group
28