Goals
• Give processes suitable places for computing.
• Take processes to the suitable place.
• Transparency
  – OS abstraction (SSI).
  – Hardware abstraction
    • Supports heterogeneous systems.
• Seamless
  – Dynamically switch workplace.
Challenges
• Security
  – Prevent malicious software?
• Availability
  – The availability of each computer is high today. How about the whole distributed system?
• Scalability
  – Provide just-in-time scalability?
Application scenarios
• Build your data center.
  – Trade-off between data migration and computation migration.
• Build your virtual cluster.
  – PC + PC
  – PC + Mobi
  – Mobi + Mobi
• Build your test bed.
  – Easy deployment
Related works
• XtreemOS
  – Grid operating system supporting virtual organizations.
  – Latest version: 3.0 (2012/02/10)
• OpenSSI
  – Provides a single process space, single I/O space, single IPC space, and single root for the cluster.
  – Latest version: 1.9.6 (2010/02/18)
• OpenMosix
  – A Linux kernel extension for single-system image clustering.
  – Latest version: 2.4.26 (closed)
Related works
Feature                                 OpenSSI             OpenMosix   Kerrighed
Single Administrative Domain            Yes                 No          Yes
Cluster Membership Guarantees           Yes                 No          No
High-Availability (HA) Clustering       Yes                 No          No
Single Process Management Space         Yes                 No          Yes
Process Migration                       Full                Partial     Partial
Process Load Balancing                  Yes                 Yes         Yes
Process Migration is HA                 Yes                 No          No
Migrate Processes Using Semaphores      Yes                 Yes         Yes
Migrate Processes Using Shared Memory   Yes                 No          Yes
Single Thread Migration                 Only thread group   No          No
Process Checkpointing                   3rd party           3rd party   Yes
Single IPC Namespace                    Yes                 No          Yes
Source: http://wiki.openssi.org/go/Features
Related works
Feature                                 OpenSSI              OpenMosix   Kerrighed
Distributed Shared Memory               Migrates             No          Yes
Single PTY Namespace                    Yes                  No          No
Single-Site File Naming                 Yes                  No          ?
Coherent Cache / File Access            Yes                  ?           ?
Migrate Active Filesystem               Experimental         No          No
HA Cluster Filesystem (CFS)             Yes                  No          No
HA Single Cluster Name/Address          Yes                  No          No
HA Network Load Balancer (LB)           Yes                  No          No
LB Auto Detects TCP/UDP IP Services     Yes (UDP 1.9.1)      N/A         N/A
Transparent Socket Migration            Not yet              No          Yes
Automatic Service Failover              Yes                  No          No
Diskless Nodes via Network Boot         Yes (PXE/EtherBoot)  No          Yes
Source: http://wiki.openssi.org/go/Features
Design
• Virtualization (1)
[Figure: three physical machines (PM). Labels on each machine: OS, VM ×2, OS ×2, App ×2.]
Design
• Virtualization (5)
[Figure: six physical machines (PM), each with two Apps; three of them also carry OS and VM labels. Two groups are marked "Center" and "User".]
Design
• Basic Architecture
[Figure: OS and WING, spanning User Space and Kernel Space. Layers shown:]
  – Kernel Layer
  – Translation Layer: WTRANS
  – Link Layer: WLN
  – Resource Layer: WRES
Design
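• Metaphor: everything is a file
[Figure: labels "sys ops", "file ops" and "res" placed against WTRANS, WLN and WRES. A mapping table with the headings Subject, Object and Operation relates System call, Path and File operation; the path shown is …/class_key_entry, with fields op, id and flags.]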
Design
• Resource
  – Computation resource
    • Process
  – Data resource
    • File
    • Memory
  – Common resource
    • IPC: msg, sem, shm
    • …
Design
• Area
  Basically, an operating system has two areas for different processes.
  – Local area: local computing.
  – Global area: distributed computing (/mnt/repo).
• The resources in the local area are invisible to the processes in the global area.
• The resources in the global area are consistent.
Design
• Isolation
  – Process isolation
    • Processes in WING are "invisible" to the world outside.
  – Data isolation
    • Data in WING is "invisible" to the world outside.
    • The private data of one user is "invisible" to other users.
      – Private files.
      – The private data inside a file.
Implementation
• Components
[Figure: component diagram showing WTRANS, WLN, WRES, WCR, CFS, IFS, WMAN, LDFS, VM and OS, with groups labeled Common Resource, Computation Resource and Data Resource.]
  – WTRANS: kernel translator
  – WLN: linker
  – WRES: common resource
  – WCR: checkpoint kernel module
  – CFS: checkpoint file system
  – WMAN: process manager
  – LDFS: lightweight distributed file system
  – IFS: image file system
Implementation
• Checkpoint
  Record the status of a process (transparent to the programmer):
  – Registers
  – Opened files
  – Signals
  – Credentials
  – Memory
  – IPC
Implementation
• Checkpoint
  – Incremental saving
  – Dynamic recovery
  – Dedicated file system
[Figure: memory pages marked "dirty", "new" or "remove" across successive checkpoints, mapped into CFS.]
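The incremental saving idea can be illustrated with a small sketch. It borrows the labels "new", "dirty" and "remove" from the figure, but everything else is an assumption: the slides do not say how WCR and CFS track pages, and the function names `incremental_checkpoint` and `apply_delta` are hypothetical. A kernel module would normally detect dirty pages through page-table state rather than by comparing contents as done here.

```python
def incremental_checkpoint(previous, current):
    """Return only what changed between two snapshots.

    Both snapshots map a page number to that page's bytes.
    (Hypothetical helper; not taken from the WING sources.)
    """
    delta = {"new": {}, "dirty": {}, "remove": []}
    for page, data in current.items():
        if page not in previous:
            delta["new"][page] = data        # page appeared since last checkpoint
        elif previous[page] != data:
            delta["dirty"][page] = data      # page content was modified
    # Pages that existed before but are gone now.
    delta["remove"] = sorted(p for p in previous if p not in current)
    return delta


def apply_delta(image, delta):
    """Rebuild the newer snapshot from an older image plus a delta."""
    restored = dict(image)
    restored.update(delta["new"])
    restored.update(delta["dirty"])
    for page in delta["remove"]:
        restored.pop(page, None)
    return restored
```

Only the delta needs to be written at each checkpoint; recovery replays deltas on top of the last full image.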
Implementation
• Distributed IPC: message, semaphore, shared memory
  – System V IPC does not support distributed communication.
  – It is complex to dump and to restore the IPC status in the kernel.
Implementation
• Distributed IPC
  – Conventional System V IPC and distributed IPC coexist (same interface).
  – Ensure the consistency of IPC resources in the distributed environment.
  – Stateless in the kernel (for process migration)
    • Re-implement IPC in user space.
    • Provide a pseudo file system to store the status of the IPC resource (RAM based).
  – High availability
[Figure: event-flow chart between Proposer and Owner. Nodes: Request; "Can process?" (Yes/No); "Can trigger another event?" (Yes/No); Send result; Send waiting index; Stop.]
Implementation
• Distributed IPC
  – Event driven
Event-Flow
Implementation
• Msg
  Requirements:
  1) msgtyp = 0: the first message on the queue is received.
  2) msgtyp > 0: the first message of type msgtyp is received.
  3) msgtyp < 0: the first message of the lowest type that is less than or equal to the absolute value of msgtyp is received.
• Sem
  Requirements:
  1) RPC has a time-out mechanism (for semtimedop).
  2) RPC has an undo mechanism (for exit_sem).
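The three msgtyp rules listed under Msg can be checked against a small model. This is only a sketch of the selection rule over an in-memory list, not WING's user-space implementation; the function name `select_message` and the representation of a queue as `(mtype, mtext)` pairs are assumptions made for illustration.

```python
def select_message(queue, msgtyp):
    """Pop and return the (mtype, mtext) pair selected by msgtyp, or None.

    queue is a list of (mtype, mtext) pairs in arrival order, with mtype > 0.
    """
    if msgtyp == 0:
        # Rule 1: take the first message on the queue.
        eligible = list(range(len(queue)))
    elif msgtyp > 0:
        # Rule 2: take the first message whose type equals msgtyp.
        eligible = [i for i, (mtype, _) in enumerate(queue) if mtype == msgtyp]
    else:
        # Rule 3: among types <= |msgtyp|, take the first message
        # of the lowest type.
        limit = -msgtyp
        below = [i for i, (mtype, _) in enumerate(queue) if mtype <= limit]
        lowest = min((queue[i][0] for i in below), default=None)
        eligible = [i for i in below if queue[i][0] == lowest]
    if not eligible:
        return None
    return queue.pop(eligible[0])
```

A distributed queue has to apply the same rule at whichever node owns the queue, so that every receiver observes one consistent order.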
Implementation
• Shared memory
  Problems:
  – Finding a (key, value) pair
  – Frequent updates of (key, value)
  Consistency model:
  – Sequential consistency
  Features:
  – Multi-owner
  – Versioning
  – Write invalidation
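To make versioning and write invalidation concrete, here is a minimal single-process sketch. It is an assumption-laden model, not the WING protocol: the class `SharedPage`, its methods, and the idea that ownership moves to the last writer are all hypothetical, and the slides do not spell out what "multi-owner" means in detail.

```python
class SharedPage:
    """Toy model of one shared-memory page with a version counter."""

    def __init__(self, owner, data=b""):
        self.owner = owner
        self.version = 0
        self.data = data
        self.copies = {owner}  # nodes currently holding a valid copy

    def read(self, node):
        """Hand a copy to node and remember that it holds one."""
        self.copies.add(node)
        return self.version, self.data

    def write(self, node, data):
        """Write from node; return the set of nodes whose copies are invalidated."""
        invalidated = self.copies - {node}
        self.copies = {node}   # only the writer keeps a valid copy
        self.owner = node      # assumption: ownership follows the writer
        self.version += 1
        self.data = data
        return invalidated

    def is_valid(self, node, cached_version):
        """A cached copy is usable only if it was not invalidated or outdated."""
        return node in self.copies and cached_version == self.version
```

A reader whose copy was invalidated must fetch the page again, which is what a page fault handler such as shm_fault would trigger.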
Implementation
• Shared memory
  Handle shm_fault
[Figure: message sequences for Cases 1 to 3 among proposer, shm owner, page owner, readers and writer, with numbered steps.]
Implementation
• Shared memory
  Handle shm_fault
[Figure: message sequences for Cases 4 and 5 among proposer, page owner, readers and writer, with numbered steps.]
Implementation
• Image File System (IFS)
  Data can be shared, but the data for each user needs to be protected.
  – Each user can have a different view of a file.
  – The processes of the same user have the same view of a file.
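One way to give each user a private view of a shared file is a per-user bitmap over the file's blocks. The sketch below assumes a copy-on-write reading of the design; the slides mention bitmaps but do not define their semantics, so the class `ImageFile`, its fields and its block granularity are hypothetical.

```python
class ImageFile:
    """Toy model: shared source blocks plus per-user private blocks."""

    def __init__(self, source_blocks):
        self.source = list(source_blocks)  # blocks visible to everyone
        self.bitmaps = {}                  # user -> list of bool, one per block
        self.images = {}                   # user -> {block index: private bytes}

    def write(self, user, index, data):
        """Store a private block for user and mark it in the user's bitmap."""
        bitmap = self.bitmaps.setdefault(user, [False] * len(self.source))
        bitmap[index] = True
        self.images.setdefault(user, {})[index] = data

    def read(self, user, index):
        """Return the user's private block if marked, else the shared block."""
        bitmap = self.bitmaps.get(user)
        if bitmap and bitmap[index]:
            return self.images[user][index]
        return self.source[index]
```

All processes of one user share that user's bitmap, so they see the same view, while other users keep reading the unmodified source.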
Implementation
• Image File System
[Figure: Image File System layout. Header labels: Bitmap, State Image, Source. User A and User B are shown with bitmaps for File 1, File 2 and File 3.]
Results
• Environment:
  – VM (x2): 512MB RAM, 2 processors, NAT
  – OS: Based on kernel 2.6.29.6
  – Host: 2048MB RAM, Intel Core 2 Duo CPU T6570
• Experiments:
  – Msg
  – Sem
Results
• Msg

Leader:
  Use msg(key_0) to collect start requests from members;
  if all start requests are received then
    Use msg(key_i) (i > 0) to send start_i;
    Use msg(key_{N+1}) to collect stop requests;
    if all stop requests are received then
      return success;
  return fail;

Member i:
  Use msg(key_0) to send start request to leader;
  if start_i is successfully received by msg(key_i) then
    create_process(msg_snd);
    create_process(msg_rcv);
    if msg_snd and msg_rcv are finished then
      Use msg(key_{N+1}) to send stop request to leader;

Process msg_snd:
  for n = 1 to ROUNDS do
    Use msg(key_0) to receive req;
    Build mtext (|mtext| = req.msgsz);
    i := req.src;
    Use msg(key_i) to send mtext;

Process msg_rcv at member i:
  for n = 1 to ROUNDS do
    req.msgsz := rand() % MSG_SIZE + 1;
    req.src := i;
    Use msg(key_0) to send req;
    Use msg(key_i) to receive mtext;

Result: 1 leader, 1 member, ROUNDS = 1000, MSG_SIZE = 128, time = 65.36134 sec
Results
• Sem

Result: 1 leader, 1 member, ROUNDS = 1000, time = 8.810316 sec

Leader:
  Create sem(key_0) (sem(key_0) has N items);
  Assign each item of sem(key_0) to 0;
  if all items of sem(key_0) are not 0 then
    for i = 1 to N do
      remove sem(key_i);

Member i:
  Create sem(key_i);
  if all sem(key_j) (j ≠ i) are created then
    for n = 1 to ROUNDS do
      k := rand() % N + 1;
      down(sem(key_k));
      up(sem(key_k));
    endfor
    sem(key_0).item_i := 1;
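For readers who want to try the member loop locally, the sketch below is a single-machine analogue using threads in place of distributed processes. It is not the benchmark that produced the timing above: `threading.Semaphore` stands in for System V semaphores, the initial semaphore value of 1 is an assumption (the slides do not state it), and the name `run_sem_benchmark` is hypothetical.

```python
import random
import threading


def run_sem_benchmark(n_members=3, rounds=100, seed=0):
    """Run the member loop with threads; return the completion flags."""
    # One semaphore per member, assumed to start at 1 so down() can proceed.
    sems = [threading.Semaphore(1) for _ in range(n_members)]
    done = [0] * n_members  # plays the role of the items of sem(key_0)

    def member(i):
        rng = random.Random(seed + i)
        for _ in range(rounds):
            k = rng.randrange(n_members)  # pick a random semaphore
            sems[k].acquire()             # down(sem(key_k))
            sems[k].release()             # up(sem(key_k))
        done[i] = 1                       # sem(key_0).item_i := 1

    threads = [threading.Thread(target=member, args=(i,)) for i in range(n_members)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done
```

Each member holds at most one semaphore at a time, so the loop cannot deadlock; the leader's check corresponds to every flag in the returned list being nonzero.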
Conclusions
• Through operating system virtualization, WING provides processes on different nodes with consistent views of the distributed resources.
• No additional libraries are required.
• Conventional multi-task applications can be used for distributed computing.
Future work
• Components
  – WMAN: process manager
  – LDFS: lightweight distributed file system
• Tools:
  – Profiler
  – Test suites
• Stability
• Security