Computing FrontiersMay 2005
J. E. Smith
Virtual Machines Virtual Machines Supporting Changing Technology Supporting Changing Technology
and New Applicationsand New Applications
VMs (c) 2005, J. E. Smith 2
IntroductionIntroduction
Why are virtual machines interesting?
They involve computer architecture in a pure sense
They allow transcending of interfaces (which often seem to be an obstacle to innovation)
They enable innovation in flexible, adaptive software & hardware, security, network computing (and others)
Virtualization technologies will be a key part of most future computer systems
VMs (c) 2005, J. E. Smith 3
OutlineOutline
Virtualization The Architecture of Virtual Machines Emulation Enhancing Security The Grid Portable Environments Co-Designed VMs
VMs (c) 2005, J. E. Smith 4
AbstractionAbstraction
Computer systems are built on levels of abstraction
Higher level of abstraction hide details at lower levels
Example: files are an abstraction of a disk
filefile
abstraction
I/O devicesand
Networking
Controllers
System Interconnect(bus)
Controllers
MemoryTranslation
Execution Hardware
DriversMemoryManager
Scheduler
Operating System
Libraries
ApplicationPrograms
MainMemory
Software
Hardware
VMs (c) 2005, J. E. Smith 5
VirtualizationVirtualization
Similar to abstractionExcept• Details not necessarily hidden
Construct Virtual Disks• As files on a larger disk• Map state• Implement functions
VMs: do the same thing with the whole “machine”
file file
virtualization
VMs (c) 2005, J. E. Smith 6
The Family of Virtual MachinesThe Family of Virtual Machines
“The subjects of virtual machines and emulators have been treated as entirely separate. … they have much in common. Not only do the usual implementations have many shared characteristics, but this commonality extends to the theoretical concepts on which they are based”
-- Efrem G. Wallach, 1973
Including things not called “virtual machines” IA-32 EL
HP Dynamo Transmeta Crusoe
There are lots of “virtual machines”IBM VM/370JavaVMware products
VMs (c) 2005, J. E. Smith 7
““Machines”Machines”
Different perspectives on what the Machine is:
OS developer Compiler developer Application programmer
Instruction Set Architecture• ISA• Major division between hardware
and software
Application Binary Interface• ABI• User ISA + OS calls
Application Program Interface• API• User ISA + library calls I/O devices
andNetworking
System Interconnect(bus)
MemoryTranslation
Execution Hardware
ApplicationPrograms
MainMemory
Operating System
Libraries
VMs (c) 2005, J. E. Smith 8
System Virtual MachinesSystem Virtual Machines
Provide a system environment
Constructed at ISA level
Persistent Examples: IBM
VM/360, VMware, Transmeta Crusoe
guestprocess
HOST PLATFORM
virtualnetwork communication
Guest OS
VMM
guestprocess
guestprocess
guestprocess
Guest OS2
VMM
guestprocess
guestprocess
VMs (c) 2005, J. E. Smith 9
Process Virtual MachinesProcess Virtual Machines
Constructed at ABI level Runtime manages guest
process Guest processes may
intermingle with host processes
Not persistent As a practical matter,
guest and host OSes are often the same
Dynamic optimizers are a special case
Examples: IA-32 EL, FX!32, Dynamo
HOST OS
Disk
file sharing
network communication
guestprocess
create
hostprocess
guestprocess
runtimeruntime
guestprocess
runtime
hostprocess
VMs (c) 2005, J. E. Smith 10
High Level Language Virtual MachinesHigh Level Language Virtual Machines
Raise the “ABI” level of abstraction• User higher level virtual ISA• OS abstracted as standard libraries
A form of process VM
HLL Program
Intermediate Code
Memory Image
Object Code(ISA)
Compiler front-end
Compiler back-end
Loader
HLL Program
Portable Code(Virtual ISA )
Host Instructions
Virt. Mem. Image
Compiler
VM loader
VM Interpreter/Translator
Traditional HLL VM
VMs (c) 2005, J. E. Smith 11
The Virtual Machine SpaceThe Virtual Machine Space
Multiprogrammed
Systems
HLL VMsCo-Designed
VMs
same ISAdifferent
ISA
Process VMs System VMs
WholeSystem VMs
differentISA
same ISA
ClassicOS VMs
DynamicBinary
Optimizers
DynamicTranslators
HostedVMs
VMs (c) 2005, J. E. Smith 12
Key Feature – State/Resource MappingKey Feature – State/Resource Mapping
VM SW can Re-map logical to physical state
• Via pointers or copying• Registers to registers• Registers to memory• Memory to disk
Guest Code
Guest Data
RuntimeData
RuntimeCode
Guest Registers
Host Registers
Host ABIAddress Space
HostRegister Space
VMs (c) 2005, J. E. Smith 13
Key Feature – EmulationKey Feature – Emulation
Interpretation• Software loop decodes and dispatches each
instruction Binary translation and code caching
• Translate blocks of instructions at a time• Hold translated blocks in code cache• With same-ISA scanning/patching is an alternative
Staged Emulation• Emulation techniques invoked in staged manner• Based on performance tradeoffs
VMs (c) 2005, J. E. Smith 14
Code CachesCode Caches
Contain• Basic blocks• Superblocks (one entrance, multiple exits)• Optimized Superblocks
A base technology for many VMs• Dynamic binary translators: Intel IA-32 EL, Compaq FX!32• Dynamic binary optimizers: Dynamo family• Co-designed virtual machines: Transmeta, IBM DAISY• High performance Java virtual machines• System VMs with “inefficiently virtualizable” ISAs• “Sandboxing” secure VMs (x86 DynamoRIO)
VMs (c) 2005, J. E. Smith 15
Code Caching with ChainingCode Caching with Chaining
Chaining of blocks in code cache minimizes VM overhead
Superblock
Dispatch table
lookup code
Superblock
Superblock
Superblock
Code Cache
VMs (c) 2005, J. E. Smith 16
Staged EmulationStaged Emulation
An important part of many VM implementations
Translate, optimize & cache frequent code sequences
Binary MemoryImage
Code CacheProfile Data
Interpreter
Translator/Optimizer
runtime
Start interpreting Profile to find “hot” code regions
VMs (c) 2005, J. E. Smith 17
Key Feature – VMM/Runtime ControlKey Feature – VMM/Runtime Control
Interpretation • Fine grain control• Every dynamic instruction “inspected” before execution
Binary translation and code caching• Coarser grain control• Every static instruction inspected before execution• Jumps to VM SW can be inserted anywhere
Protection levels• Very coarse grain control• Every resource-related instruction trapped by protection system
Otherwise, use interpretation/translation techniques• Used in system VMs to manage resource mappings
VMs (c) 2005, J. E. Smith 18
VMM Resource Control in System VMsVMM Resource Control in System VMs
Traps and interrupts (& sys calls)• Transfer to VMM• VMM determines appropriate Guest OS• VMM transfers to Guest OS
Guest OS “return” to user app.• Transfer to VMM• VMM bounces return back to Guest app.
Resource sensitive instructions• Trap to VMM• VMM checks correctness• VMM reads/modifies guest resource• Returns to Guest
privileged operation
next instruction
check privileges
perform operation
return
system call/trap
vector location:
virtual vector location:
Application
Guest OS
VMM
VMs (c) 2005, J. E. Smith 19
VMM as a Smart InterconnectVMM as a Smart Interconnect
Two modes:• Execution mode• VM mode
After it gains control• VM SW can manage resources via state mapping• VM SW can alter/enhance functions via emulation
ISA 1
OS 1
apps 1OS 2
apps 2
OS 1
apps 1
ISA 1
OS 1
apps 1
ISA 1
OS 2
apps 2
VMs (c) 2005, J. E. Smith 20
SecuritySecurity
Many security threats• Worms, viruses, Trojan horses, etc.
Typical attack – get access to privileged part of system• Often with little effort
Compromised passwords
“Easy” passwords
Mechanically repeated efforts• Exploit weakness in system software
Unchecked accesses to system data structures
Can get control in privileged state by causing overflows
VMs (c) 2005, J. E. Smith 21
Buffer OverflowBuffer Overflow
User invokessystem programwith normal input
User Mode Supervisor Mode
System programperforms function
and returns to user
User performssubsequent task
User invokessystem program
with faulty input thatcauses buffer
overflow in stack
Return address in stackclobbered due to
overflow. Vulnerablesystem program peformsfunction and returns to
illegal address
Systemexception!
User Mode Supervisor Mode
(a) Normal Input (b) Faulty Input
VMs (c) 2005, J. E. Smith 22
Malicious Input – IntrusionMalicious Input – Intrusion
User Mode Supervisor Mode
Malicious user invokessystem program with
tailored input that causesbuffer overflow in stack Return address in stack
changed due to overflow.Vulnerable system program
peforms function and returns touser-specified address, e.g.
address of shell program
User gets full control of systemthrough shell program running
in supervisor mode
VMs (c) 2005, J. E. Smith 23
Intrusion Detection SystemsIntrusion Detection Systems
Isolation is not an option• Increasing dependence on communication over networks
Language-level checking• Java, MSIL – range- and type-checking• Legacy applications and legacy style not protected
Need for Intrusion Detection Systems (IDS)• Depend on knowledge of potential attacks• Network-based Intrusion Detection Systems (NIDS)• Host-based Intrusion Detection Systems (HIDS)
VMs (c) 2005, J. E. Smith 24
Host Intrusion Detection SystemsHost Intrusion Detection Systems
Directly examine activity on host• Knowledge of host operating system• Look for repeated attempts
To crack passwordTo access unauthorized files, etc.
HIDS has significantly better viewpoint compared to NIDS
But HIDS can be disabled by attack• Or can provide misleading information
VMs (c) 2005, J. E. Smith 25
Monitoring and Recovering from AttacksMonitoring and Recovering from Attacks
Importance of understanding attacks• To recover from an attack• To prevent future attacks
Logging• Save information about critical activity on system
Know the events that caused the failure• Save checkpoint of state of system
Reconstruct the attack from a known good state
VMs (c) 2005, J. E. Smith 26
Virtual Machines as a SandboxVirtual Machines as a Sandbox
Fault containment important feature of VMs VM Isolation helps in close examination of attack
• Clone system that has been attacked for later analysis Use VM as a “honey-pot”
• Permit attacks that can be monitored
VM1
Hardware
Virtual Machine Monitor
VM2 VM3 VM4
Production Virtual Machines
VMs (c) 2005, J. E. Smith 27
Virtual Machine for MonitoringVirtual Machine for Monitoring
Livewire system (Stanford)• Separates IDS from VMM• IDS configures the VMM to monitor activity at more
than the usual pointsSignature of suspicious activity may be specified
• After initialization, IDS enters the picture only in analyzing data from suspicious activity
• Feedback – suggest new monitoring based on analysisE.g. monitor system call activity after repeated login
attempts• May need knowledge of OS to analyze data, e.g. crash
dumps
VMs (c) 2005, J. E. Smith 28
Livewire IDSLivewire IDS
Hardware
Virtual Machine Monitor
Policy Modules
Guest OS
Guest Apps
IDS
Policy Engine
OS Interface Library
Query Response
Policy Framework
Command
Callback
ConfigFile
GuestOS
Metadata
Guest VirtualMachine
VMs (c) 2005, J. E. Smith 29
Policy Modules in LivewirePolicy Modules in Livewire
Polling modules• Lie detector module
VMM knows hardware state for each virtual machineLie detector compares this state to the state provided as
feedback from intruder• User program integrity detector module
Compare signatures of memory pages with saved signatures• Signature detector module
Scan memory with signature of known viruses, Trojan horse programs, etc.
Event-driven modules• Memory access enforcer module
VMM intercepts attempts to change page access privileges
VMs (c) 2005, J. E. Smith 30
Dynamic Binary RewritingDynamic Binary Rewriting
Program shepherding• Control execution of program
Prevent program from being attacked
Prevent program from being launching point for attacks
RIO System (MIT)• Based on Dynamo binary optimization system• Target of every control transfer instruction verified
Not to unauthorized locations
Only to safe locations
VMs (c) 2005, J. E. Smith 31
RIO Dynamic Binary Rewriting SystemRIO Dynamic Binary Rewriting System
Two levels of translation• Quick translation (basic blocks)• High performance translation (superblocks)
Security Checks• All code inspected during translation• All control transfers are checked before caching/table placement• Code cache and map table are protected• Small performance loss
Basic Block Cache Superblock Cache
Indirect BranchLookup Routine
Dispatch Routine
Basic Block Builder Superblock Selector
START
Application Mode
RIO Mode
VMs (c) 2005, J. E. Smith 32
Migration of Computing EnvironmentsMigration of Computing Environments
Identical environment at any work location• When moving from one location to another
E.g. Home to work and back• Effect similar to carrying hardware back and forth
Physical security has to be taken care of Entire state of machine must be transported
• State of processor resources
For OS as well as applications• Includes active code and data
Concept of a capsule• Compressed information about entire system
Can be transported from one location to another
VMs (c) 2005, J. E. Smith 33
Virtual ComputersVirtual Computers
Encapsulation simplified through use of virtual machines
Encapsulation has the effect of checkpointing
• Suspend operation on one platform and resume execution at exactly same point on another platform
Hardware 1
Virtual Machine Monitor 1
Guest OS
Guest Apps
Virtual Machine
Hardware 2
Virtual Machine Monitor 2
Guest OS
Guest Apps
Virtual Machine
Traditional DataMigration
Hardware 1
OS1
Apps1
Hardware 2
OS2
Apps2Dat
a
Dat
a
VM Migration
VMs (c) 2005, J. E. Smith 34
VMotion (VMware)VMotion (VMware)
Migration of virtual machines in commercial environment
• Load balancing• Security, e.g. quarantine
attacked machine• Co-location• Fault-tolerance• Power management• Maintenance
VC ManagementServer
VC Client(User 1)
VC Client(User 2)
VC Client(User 3)
VC Client(User 4)
VCDatabase
VM1 VM2 VM3
DataStore
VCagent
hostA
VM4 VM5 VM6
SAN
VCagent
hostB
VM7 VM8 VM9
VCagent
hostC
VMs (c) 2005, J. E. Smith 35
Migration StepsMigration Steps
Step 1: Ensure that VM is stable on current host Step 2: Perform baseline copy
• Copy of current memory state and data Step 3: Suspend VM on current host Step 4: Perform final copy
• Send incremental capsule containing changes since baseline copy Step 5: Activate VM on new host
VMs (c) 2005, J. E. Smith 36
Grids: Virtual OrganizationsGrids: Virtual Organizations
Virtual Organization Q
Ray Tracing using cycles providedby cycle-sharing consortium
“Participants in Pcan run Program A”
Virtual Organization P
Multidisciplinary design usingprograms and data at multiple
locations
“Participants in Q canuse idle cycles if
budget not exceeded”
“Participants in Pcan run Program B”
“Participants in Pcan use Data D”
VMs (c) 2005, J. E. Smith 37
Comparison with Conventional VMsComparison with Conventional VMs
Efficient utilization of resources• Similar in motivation to original system VMs
Sharing of resources• Grid concerned with sharing of content also
Not just sharing of resources Distributed control
• Grid has global scopeUsers negotiate with each other to share and use
resources Heterogeneous nodes
• Nodes in a grid may be different types of machines Adaptation of applications
• Applications may need to be adapted for the grid Portability of applications
• Conceptually similar to goals of HLL VMs
VMs (c) 2005, J. E. Smith 38
Role of System VMs in a GridRole of System VMs in a Grid
Grid has to manage and schedule resources• Like an operating system
However, grid has to deal with heterogeneity• Accounting, for example, is dependent on
accounting policies of each grid participant System VM-based approach
• Treat a VM as the unit of transactions on a grid
Not tasks, or programs
( Figuieredo and Fortes)
VMs (c) 2005, J. E. Smith 39
System-VM Based GridSystem-VM Based Grid
Virtual Machines(Back End)
Application Server(Front End F)
The Internet
Physical Server P
User X
InformationService
Image Server I
Data Server D
V1 V2 V3 Vn
VMs (c) 2005, J. E. Smith 40
Advantages of SVM based ApproachAdvantages of SVM based Approach
User isolation• Protect user from host and other users• Protect host from users
Platform independence• User specifies type of machine, not actual machine
Task management and accounting• Simplifies allocation and accounting
Allocate based on compute requirementsCharge based on performance of VM
Portability• Allows applications to be written for execution on the widest
range of platforms• Eases encapsulation and migration of jobs between nodes
on grid; e.g. Java VMs can be migrated
VMs (c) 2005, J. E. Smith 41
Co-Designed Virtual MachinesCo-Designed Virtual Machines
Separate the hardware/software interface from the ISA level of abstraction Restore the ISA to its “natural” place
as an Implementation ISA that reflects actual hardware Support existing ISAs
as a Virtual ISA Let processor designers use both hardware and software A form of system VM
OS
libs.
User Applications
V-ISA
I-ISA
Hardware
Software
Hardware
OS
libs.
User Applications
ISA
VMs (c) 2005, J. E. Smith 42
Co-Designed VMsCo-Designed VMs
Should be of interest to both architects and micro-architects
• Offers opportunities for performance, power saving, fault tolerance and other implementation-dependent features
• Allows transcending conventional ISAs• IBM Daisy and Transmeta Crusoe• Don’t confuse them with VLIW!
“pioneers are the ones with arrows in their backs”
VMs (c) 2005, J. E. Smith 43
Architecture Issues: Concealed MemoryArchitecture Issues: Concealed Memory
VM software resides in memory concealed from all conventional software
Source ISA Data
CodeCache
VM Code
ICacheHierarchy
DCacheHierarchy
ProcessorCore
Source ISA Code
VM Data
concealed memory
conventionalmemory
VMs (c) 2005, J. E. Smith 44
Another Way of Doing ThingsAnother Way of Doing Things
conventional
dynamic translation
Code CacheProcessor
Pipeline
Software
Translator
Main Memory
Func.Unit
Func.Unit
. ..
Main MemoryCache
HierarchyProcessor
Pipeline
TranslationUnit
(form uops)
Func.Unit
Func.Unit
Func.Unit
. ..
TranslationUnit
(form uops)
CacheHierarchy
VMs (c) 2005, J. E. Smith 45
Fused Instruction SetFused Instruction Set
Co-designed VM x86 implementation• Shorten and simplify pipeline front-end
Combine pairs of dependent instructions• For single “unit” for pipeline processing
Use VM software to• “Crack” x86 instructions into RISC-ops• Re-order RISC-ops• Reassemble into (new) fused pairs
Related: Pentium-M fuses in front-end• Using original x86 instructions• “Reduced Splitting” is more accurate description
VMs (c) 2005, J. E. Smith 46
Fusing ProfileFusing Profile
About 50% of operations are fused Only 5-10% of non-fused are single-cycle ALU ops
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
164.
gzip
175.
vpr
176.
gcc
181.
mcf
186.
craf
ty
197.
parse
r
252.
eon
253.
perlb
mk
254.
gap
255.
vorte
x
256.
bzip2
300.
twolf
Avera
ge
Per
cen
tag
e o
f D
ynam
ic I
nst
ruct
ion
s
ALU
FP or NOPs
BR
ST
LD
Fused
VMs (c) 2005, J. E. Smith 47
PerformancePerformance
-10
0
10
20
30
40
50
60
70
164.g
zip
175.v
pr 17
6.gcc
18
1.mcf
186.c
rafty
197.p
arser
252.e
on
253.p
erlbm
k 25
4.gap
25
5.vort
ex
256.b
zip2
300.t
wolf
Harm
onic
Nom
arliz
ed IP
C s
peed
up (%
)
M0: Base + Code Cache M1:= M0 + fusing M2:= M1 + shorter pipe Macro-op:= M2 + 3-1 ALU
VMs (c) 2005, J. E. Smith 48
SummarySummary
Many types of VMs• But common implementation technologies
A smart interconnect component• Should be studied/taught as a discipline on its own• Alongside OS, Application SW, HW
Many avenues for research• Lots of applications• Architecture meta-issues –
What features of OS, Applications, HW are “VM friendly”?
E.g. Goldberg work in early 70s for system VMs• Primitives for supporting VMs