Computer Laboratory
Virtualizing the Data Center with Xen
Steve Hand
University of Cambridge and XenSource
What is Xen Anyway?
• Open source hypervisor
  – run multiple OSes on one machine
  – dynamic sizing of virtual machines
  – much improved manageability
• Pioneered paravirtualization
  – modify OS kernel to run on Xen
  – (application mods not required)
  – extremely low overhead (~1%)
• Massive development effort
  – first (open source) release 2003
  – today have hundreds of talented community developers
Problem: Success of Scale-out
“OS+app per server” provisioning leads to server sprawl
Server utilization rates <10%
Expensive to maintain, house, power, and cool
Slow to provision, inflexible to change or scale
Poor resilience to failures
Solution: Virtualize With Xen
Consolidation: fewer servers slashes CapEx and OpEx
Higher utilization: make the most of existing investments
“Instant on” provisioning: any app on any server, any time
Robustness to failures and “auto-restart” of VMs on failure
Further Virtualization Benefits
• Separating the OS from the hardware
  – Users no longer forced to upgrade their OS to run on the latest hardware
• Device support is part of the platform
  – Write one device driver rather than N
  – Better for system reliability/availability
  – Faster to get new hardware deployed
• Enables “Virtual Appliances”
  – Applications encapsulated with their OS
  – Easy configuration and management
Virtualization Possibilities
• Value-added functionality from outside the OS:
  – Fire-walling / network IDS / “inverse firewall”
  – VPN tunnelling; LAN authentication
  – Virus, mal-ware and exploit scanning
  – OS patch-level status monitoring
  – Performance monitoring and instrumentation
  – Storage backup and snapshots
  – Local disk as just a cache for network storage
  – Carry your world on a USB stick
  – Multi-level secure systems
This talk
• Introduction
• Xen 3
  – para-virtualization, I/O architecture
  – hardware virtualization (today + tomorrow)
• XenEnterprise: overview and performance
• Case Study: Live Migration
• Outlook
Xen 3.0 (5th Dec 2005)
• Secure isolation between VMs
• Resource control and QoS
• x86 32/PAE36/64 plus HVM; IA64, Power
• PV guest kernel needs to be ported
  – User-level apps and libraries run unmodified
• Execution performance close to native
• Broad (Linux) hardware support
• Live relocation of VMs between Xen nodes
• Latest stable release is 3.1 (May 2007)
Xen 3.x Architecture
[Diagram: the Xen Virtual Machine Monitor (virtual CPU, virtual MMU, event channels, control interface, safe hardware interface) sits on the hardware (SMP, MMU, physical memory, Ethernet, SCSI/IDE). VM0 (dom0, XenLinux) holds the native device drivers, back-end drivers, and the device manager and control software; VM1 and VM2 (XenLinux) run unmodified user software over front-end device drivers; VM3 runs an unmodified guest OS (WinXP) via HVM. Ports: x86_32, x86_64, IA64; AGP/ACPI/PCI; SMP.]
Para-Virtualization in Xen
• Xen extensions to x86 arch
  – Like x86, but Xen invoked for privileged ops
  – Avoids binary rewriting
  – Minimizes number of privilege transitions into Xen
  – Modifications relatively simple and self-contained
• Modify kernel to understand virtualised env.
  – Wall-clock time vs. virtual processor time
    • Desire both types of alarm timer
  – Expose real resource availability
    • Enables OS to optimise its own behaviour
Para-Virtualizing the MMU
• Guest OSes allocate and manage own PTs
  – Hypercall to change PT base
• Xen must validate PT updates before use
  – Allows incremental updates, avoids revalidation
• Validation rules applied to each PTE:
  1. Guest may only map pages it owns*
  2. Pagetable pages may only be mapped RO
• Xen traps PTE updates and emulates, or ‘unhooks’ PTE page for bulk updates
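The two validation rules can be sketched as a tiny check. This is a hypothetical Python model for illustration; Xen implements this in C inside the hypervisor, and the names here are invented:

```python
# Illustrative model of Xen's PTE validation rules (rules 1 and 2 above).
# Names and data structures are hypothetical.

PAGE_RW = 0x2  # writable bit in an x86 PTE

def validate_pte(guest_owned_pfns, pagetable_pfns, pfn, flags):
    """Return True iff the guest may install this mapping."""
    # Rule 1: a guest may only map pages it owns.
    if pfn not in guest_owned_pfns:
        return False
    # Rule 2: pagetable pages may only be mapped read-only, so the
    # guest cannot bypass validation by writing PTEs directly.
    if pfn in pagetable_pfns and (flags & PAGE_RW):
        return False
    return True

owned = {10, 11, 12}
pt_pages = {12}
assert validate_pte(owned, pt_pages, 11, PAGE_RW)      # normal RW mapping
assert not validate_pte(owned, pt_pages, 99, PAGE_RW)  # page not owned
assert not validate_pte(owned, pt_pages, 12, PAGE_RW)  # PT page mapped RW
assert validate_pte(owned, pt_pages, 12, 0)            # PT page mapped RO
```

Because PT pages are only ever mapped read-only, every guest write to them traps into Xen, which is what makes the trap-and-emulate and ‘unhook’ paths possible.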
SMP Guest Kernels
• Xen 3.x supports multiple VCPUs
  – Virtual IPIs sent via Xen event channels
  – Currently up to 32 VCPUs supported
• Simple hotplug/unplug of VCPUs
  – From within VM or via control tools
  – Optimize one-active-VCPU case by binary patching spinlocks
• NB: Many applications exhibit poor SMP scalability; often better off running multiple instances, each in its own OS
Performance: SPECJBB
[Bar chart: SPECjbb scores for native vs. Xen, both in the 7940–8080 range. Average 0.75% overhead.]

3 GHz Xeon, 1 GB memory / guest; 2.6.9.EL vs. 2.6.9.EL-xen
Performance: dbench
[Line chart: dbench throughput for native vs. Xen at 1, 2, 4, 8, 16 and 32 CPUs. < 5% overhead up to 8-way SMP.]
Kernel build
[Bar chart: build time (s) for native vs. Xen at 1, 2, 4, 6 and 8 virtual CPUs; overheads 5%, 8%, 13%, 18% and 23% respectively. 32b PAE; parallel make, 4 processes per CPU. Source: XenSource, Inc: 10/06]
I/O Architecture
• Xen IO-Spaces delegate guest OSes protected access to specified h/w devices
  – Virtual PCI configuration space
  – Virtual interrupts
  – (Need IOMMU for full DMA protection)
• Devices are virtualised and exported to other VMs via Device Channels
  – Safe asynchronous shared memory transport
  – ‘Backend’ drivers export to ‘frontend’ drivers
  – Net: use normal bridging, routing, iptables
  – Block: export any blk dev e.g. sda4, loop0, vg3
• (Infiniband / Smart NICs for direct guest IO)
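The backend/frontend split over a device channel can be sketched as follows. This is a toy Python model with invented names; the real transport is a fixed-size shared-memory ring with event-channel notifications, and deques stand in for the two directions here:

```python
from collections import deque

# Toy model of a Xen device channel: requests flow from a guest's
# frontend driver to the backend (in dom0 or a driver domain), and
# responses flow back.

class DeviceChannel:
    def __init__(self):
        self.requests = deque()   # frontend -> backend
        self.responses = deque()  # backend -> frontend

    # Frontend side (runs in the guest VM).
    def frontend_submit(self, req):
        self.requests.append(req)  # would also kick an event channel

    def frontend_poll(self):
        return self.responses.popleft() if self.responses else None

    # Backend side (runs in dom0 or a driver domain).
    def backend_service(self, handler):
        while self.requests:
            self.responses.append(handler(self.requests.popleft()))

chan = DeviceChannel()
chan.frontend_submit(("read", 4))  # e.g. a block read of sector 4
chan.backend_service(lambda req: ("done", req[1]))
assert chan.frontend_poll() == ("done", 4)
```

The asynchronous queue is what makes the transport safe: the backend validates each request before touching real hardware, and a misbehaving frontend can only fill its own ring.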
Isolated Driver Domains
[Diagram: as the previous architecture, but the native device driver and its back-end now run in a separate driver domain (VM1, XenLinux); VM2 (XenLinux) and VM3 (XenBSD) reach it through front-end device drivers.]
Isolated Driver VMs
• Run device drivers in separate domains
• Detect failure e.g.
  – Illegal access
  – Timeout
• Kill domain, restart
• E.g. 275ms outage from failed Ethernet driver

[Chart: network throughput over a 40 s run, showing only a brief dip when the Ethernet driver domain fails and is restarted.]
Hardware Virtualization (1)
• Paravirtualization…
  – has fundamental benefits… (c/f MS Viridian)
  – but is limited to OSes with PV kernels.
• Recently seen new CPUs from Intel, AMD
  – enable safe trapping of ‘difficult’ instructions
  – provide additional privilege layers (“rings”)
  – currently shipping in most modern server, desktop and notebook systems
• Solves part of the problem, but…
Hardware Virtualization (2)
• CPU is only part of the system
  – also need to consider memory and I/O
• Memory:
  – OS wants contiguous physical memory, but Xen needs to share between many OSes
  – Need to dynamically translate between guest physical and ‘real’ physical addresses
  – Use shadow page tables to mirror guest OS page tables (and implicit ‘no paging’ mode)
• Xen 3.0 includes software shadow page tables; future x86 processors will include hardware support.
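The translation behind shadow page tables can be sketched as follows. This is a toy model under simplifying assumptions (invented names, no permissions, paging modes or invalidation, all of which real shadow code must handle):

```python
# Toy illustration of shadow page tables: the guest OS builds page
# tables mapping virtual pages to "guest physical" frames; Xen keeps a
# shadow table mapping the same virtual pages to real machine frames,
# using its guest-physical -> machine-physical (p2m) mapping.

def build_shadow(guest_pt, p2m):
    """Mirror the guest's page table, translating frame numbers."""
    return {vpage: p2m[gpfn] for vpage, gpfn in guest_pt.items()}

# The guest believes it owns contiguous physical frames 0..2.
guest_pt = {0x1000: 0, 0x2000: 1, 0x3000: 2}
# Xen actually scattered those frames across the machine.
p2m = {0: 57, 1: 13, 2: 88}

shadow = build_shadow(guest_pt, p2m)
assert shadow == {0x1000: 57, 0x2000: 13, 0x3000: 88}
```

The hardware walks only the shadow table; Xen must keep it coherent with the guest's table, which is exactly the bookkeeping that upcoming hardware MMU virtualization removes.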
Hardware Virtualization (3)
• Finally we need to solve the I/O issue
  – non-PV OSes don’t know about Xen
  – hence run ‘standard’ PC ISA/PCI drivers
• Just emulate devices in software?
  – complex, fragile and non-performant…
  – … but OK as a backstop mechanism.
• Better:
  – add PV (or “enlightened”) device drivers to OS
  – well-defined driver model makes this relatively easy
  – get PV performance benefits for I/O path
Xen 3: Hardware Virtual Machines
• Enable Guest OSes to be run without modification
  – E.g. legacy Linux, Solaris x86, Windows XP/2003
• CPU provides vmexits for certain privileged instrs
• Shadow page tables used to virtualize MMU
• Xen provides simple platform emulation
  – BIOS, APIC, IOAPIC, RTC, net (pcnet32), IDE emulation
• Install paravirtualized drivers after booting for high-performance IO
• Possibility for CPU and memory paravirtualization
  – Non-invasive hypervisor hints from OS
HVM Architecture

[Diagram: Domain 0 (Linux xen64) runs the control panel (xm/xend), native device drivers, back-end virtual drivers, and the device models that provide a guest BIOS and virtual platform (with PIC/APIC/IOAPIC emulation) for unmodified OSes in HVM guest VMs, shown in both 32-bit and 64-bit modes. Guests enter Xen via VMExit; dom0 communicates via callbacks/hypercalls and event channels. Xen itself provides the control interface, hypercalls, event channels, scheduler, and I/O emulation (PIT, APIC, PIC, IOAPIC) over processor and memory.]
Progressive Paravirtualization
• Hypercall API available to HVM guests
• Selectively add PV extensions to optimize
  – Network and Block IO
  – XenAPIC (event channels)
  – MMU operations
    • multicast TLB flush
    • PTE updates (faster than page fault)
    • page sharing
  – Time (wall-clock and virtual time)
  – CPU and memory hotplug
[Diagram: the same HVM architecture, but each HVM guest now also loads front-end (FE) virtual drivers that talk to dom0's back-ends directly, bypassing the device-model emulation path for I/O.]
Xen 3.0.3 Enhanced HVM I/O
HVM I/O Performance

[Bar chart: throughput (Mb/s, rx and tx, 0–1000 scale) for emulated I/O (ioemu), PV-on-HVM drivers, and pure PV. Measured with ttcp (1500 byte MTU). Source: XenSource, Sep 06]
[Diagram: the HVM architecture again, highlighting the per-guest IO emulation performed by the device models in dom0. Future plan: I/O stub domains.]
HW Virtualization Summary
• CPU virtualization available today
  – lets Xen support legacy/proprietary OSes
• Additional platform protection imminent
  – protect Xen from IO devices
  – full IOMMU extensions coming soon
• MMU virtualization also coming soon:
  – avoid the need for s/w shadow page tables
  – should improve performance and reduce complexity
• Device virtualization arriving from various folks:
  – networking already here (ethernet, infiniband)
  – [remote] storage in the works (NPIV, VSAN)
  – graphics and other devices sure to follow…
What is XenEnterprise?
• Multi-Operating System
  – Windows, Linux and Solaris
• Bundled Management Console
• Easy to use
  – Xen and guest installers and P2V tools
  – For standard servers and blades
• High Performance Paravirtualization
  – Exploits hardware virtualization
  – Per guest resource guarantees
• Extensible Architecture
  – Secure, tiny, low maintenance
  – Extensible by ecosystem partners
Get Started with XenEnterprise

“Ten minutes to Xen”
» Packaged and supported virtualization platform for x86 servers
» Includes the Xen™ Hypervisor
» High performance bare metal virtualization for both Windows and Linux guests
» Easy to use and install
» Management console included for standard management of Xen systems and guests
» Microsoft supports Windows customers on XenEnterprise
» Download free from www.xensource.com and try it out!
A few XE 3.2 Screenshots…

[Screenshots: resource usage statistics (Windows 2003 Server, Windows XP); easy VM creation; clone and export VMs (Red Hat Linux 4).]
SPECjbb2005 (Sun JVM, RHEL5 guest)

[Bar chart: relative score to native (higher is better) at 2-vcpu and 4-vcpu, for Native, ESX 3.0.1 and XenEnterprise 3.2.]
w2k3 SPECcpu2000 integer (multi)
[Bar chart: relative score to native (higher is better) for Native, XenEnterprise 3.2 and ESX 3.0.1.]
w2k3 Passmark-CPU results vs native
[Bar chart: relative score to native (higher is better) across Integer Math, Floating Point Math, SSE/3DNow!, Compression, Encryption, Image Rotation, String Sorting and overall CPU Mark, for Native, ESX 3.0.1 and XenEnterprise 3.2.]
w2k3 Passmark memory vs native
[Bar chart: relative score to native (higher is better) across Allocate Small Block, Read Cached, Read Uncached, Write and overall Memory Mark, for Native, ESX 3.0.1 and XenEnterprise 3.2.]
Performance: Conclusions
• Xen PV guests typically pay ~1% (1 VCPU) or ~2% (2 VCPU) compared to native
  – ESX 3.0.1 pays 12–21% on the same benchmarks
• XenEnterprise includes:
  – hardened and optimized OSS Xen
  – proprietary PV drivers for MS OSes
  – matches or beats ESX performance in nearly all cases
• Some room for further improvement…
Live Migration of VMs: Motivation
• VM migration enables:
  – High-availability
    • Machine maintenance
  – Load balancing
    • Statistical multiplexing gain

[Diagram: a VM moving between two Xen hosts.]
Assumptions
• Networked storage
  – NAS: NFS, CIFS
  – SAN: Fibre Channel
  – iSCSI, network block dev
  – DRBD network RAID
• Good connectivity
  – common L2 network
  – L3 re-routeing

[Diagram: two Xen hosts sharing networked storage.]
Challenges
• VMs have lots of state in memory
• Some VMs have soft real-time requirements
  – E.g. web servers, databases, game servers
  – May be members of a cluster quorum
  ⇒ Minimize down-time
• Performing relocation requires resources
  ⇒ Bound and control resources used
Relocation Strategy

Stage 0: pre-migration. VM active on host A; destination host selected (block devices mirrored).
Stage 1: reservation. Initialize container on target host.
Stage 2: iterative pre-copy. Copy dirty pages in successive rounds.
Stage 3: stop-and-copy. Suspend VM on host A; redirect network traffic; synch remaining state.
Stage 4: commitment. Activate on host B; VM state on host A released.
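Stages 2 and 3 can be sketched as a small simulation. The geometric dirty-set model below is an assumption for illustration, not measured Xen behaviour:

```python
# Toy simulation of iterative pre-copy (stages 2-3 above). The model of
# re-dirtying (a fixed fraction of copied pages) is illustrative only.

def precopy_migrate(n_pages, dirty_fraction=0.3, threshold=16, max_rounds=30):
    """Return (pre-copy rounds used, pages copied during downtime)."""
    dirty = n_pages                 # round 1: every page must be sent
    rounds = 0
    while dirty > threshold and rounds < max_rounds:
        rounds += 1
        # Copy the current dirty set while the VM keeps running; the VM
        # re-dirties a fraction of those pages before the round ends.
        dirty = int(dirty * dirty_fraction)
    # Stage 3, stop-and-copy: suspend the VM and send the remainder.
    return rounds, dirty

rounds, downtime_pages = precopy_migrate(1024)
assert (rounds, downtime_pages) == (4, 8)
```

Downtime is bounded by the final dirty set, which is why the round and bandwidth limits matter: workloads with large writable working sets never converge and simply hit max_rounds, paying more stop-and-copy time.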
Pre-Copy Migration: Round 1

Pre-Copy Migration: Round 2

Pre-Copy Migration: Final
Web Server Migration
Iterative Progress: SPECWeb

[Chart: iterative pre-copy progress for SPECweb; annotated at 52 s.]
Quake 3 Server relocation
Xen 3.1 Highlights
• Released May 18th 2007
• Lots of new features:
  – 32-on-64 PV guest support
    • run PAE PV guests on a 64-bit Xen!
  – Save/restore/migrate for HVM guests
  – Dynamic memory control for HVM guests
  – Blktap copy-on-write support (incl. checkpoint)
  – XenAPI 1.0 support
    • XML configuration files for virtual machines
    • VM life-cycle management operations; and
    • Secure on- or off-box XML-RPC with many bindings.
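The XenAPI XML-RPC interface can be driven from any language with XML-RPC bindings. The sketch below follows the XenAPI session/VM call convention; the host URL, credentials and error handling are illustrative assumptions, not a tested client:

```python
import xmlrpc.client

# Minimal sketch of a XenAPI 1.0 client. XenAPI calls return a record
# whose "Value" field carries the result on success; real code should
# also check the "Status" field.

def list_vm_labels(url, user, password):
    """Log in to a XenAPI endpoint and return the name labels of all VMs."""
    server = xmlrpc.client.ServerProxy(url)
    session = server.session.login_with_password(user, password)["Value"]
    try:
        vm_refs = server.VM.get_all(session)["Value"]
        return [server.VM.get_name_label(session, ref)["Value"]
                for ref in vm_refs]
    finally:
        server.session.logout(session)

# Usage (against a real XenAPI endpoint; host name is hypothetical):
#   list_vm_labels("http://xenhost/", "root", "secret")
```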
Xen 3.x Roadmap
• Continued improvement of full-virtualization
  – HVM (VT/AMD-V) optimizations
  – DMA protection of Xen, dom0
  – Optimizations for (software) shadow modes
• Client deployments: support for 3D graphics, etc.
• Live/continuous checkpoint / rollback
  – disaster recovery for the masses
• Better NUMA (memory + scheduling)
• Smart I/O enhancements
• “XenSE”: Open Trusted Computing
Backed By All Major IT Vendors
* Logos are registered trademarks of their owners
Summary
• Xen is re-shaping the IT industry
  – Commoditize the hypervisor
  – Key to volume adoption of virtualization
  – Paravirtualization in the next release of all OSes
• XenSource Delivers Volume Virtualization
  – XenEnterprise shipping now
  – Closely aligned with our ecosystem to deliver full-featured, open and extensible solutions
  – Partnered with all key OSVs to deliver an interoperable virtualized infrastructure
Thanks!
• EuroSys 2008
• Venue: Glasgow, Scotland, Apr 2-4 2008
• Important dates:
  – 14 Sep 2007: submission of abstracts
  – 21 Sep 2007: submission of papers
  – 20 Dec 2007: notification to authors
• More details at http://www.dcs.gla.ac.uk/Conferences/EuroSys2008