Outline
Volunteer Computing and CAS@Home
How to port BOSS
A big picture
Related technology
Glue together
Volunteer Computing and BOINC
Volunteer Computing (VC) is an established technology that lets users contribute to important challenges in fundamental science and medicine by donating idle time on their PCs, and even by taking part in data analysis over the Internet.
Berkeley Open Infrastructure for Network Computing (BOINC) is middleware that harnesses the computing resources provided by volunteers. It currently has more than 300,000 registered users and 500,000 registered hosts.
Middleware: BOINC, XtremWeb, Xgrid, Grid MP
Basic Model for VC Based BOINC
The BOINC client is distributed to volunteer PCs, laptops, or clusters.
The BOINC client gets the application and input files from the server, runs the application, and sends the results back to the BOINC server.
The scientist deploys and manages the BOINC server and develops the application.
[Diagram: BOINC Server and BOINC Client with its Application, linked by flows labeled APP, Input, Job, Output/CPU time, and Control Message; steps numbered 1 to 3.]
Workflow
1. A user submits a job (or a batch of jobs) to the BOINC server.
2. The BOINC server generates work units from the submitted jobs.
3. An active client requests a job from the BOINC server.
4. The BOINC server sends the job, input files, and application to the requesting client.
5. The job is executed on the client side. It is suspended if the client's host machine is "busy" and continues afterwards.
6. When the job finishes, the output data is sent back to the BOINC server.
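The pull model in steps 1 to 6 can be illustrated with a small in-memory sketch. This is not BOINC code: `ToyServer`, `WorkUnit`, and `run_client` are hypothetical names, the "application" is just a Python function, and suspension on a busy host is left out for brevity.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkUnit:
    """One unit of work generated from a submitted job (hypothetical structure)."""
    job_id: int
    input_data: str


@dataclass
class ToyServer:
    """In-memory stand-in for a BOINC server; not the real scheduler."""
    queue: deque = field(default_factory=deque)
    results: dict = field(default_factory=dict)
    next_id: int = 0

    def submit(self, inputs):
        # Steps 1-2: a user submits a batch; the server turns it into work units.
        for item in inputs:
            self.queue.append(WorkUnit(self.next_id, item))
            self.next_id += 1

    def request_work(self):
        # Steps 3-4: a client asks for work; the server hands out one unit or None.
        return self.queue.popleft() if self.queue else None

    def report(self, job_id, output):
        # Step 6: the client sends its output back.
        self.results[job_id] = output


def run_client(server, app):
    """Pull work units until none are left; return how many were processed."""
    done = 0
    while (unit := server.request_work()) is not None:
        output = app(unit.input_data)  # step 5: run the application on the input
        server.report(unit.job_id, output)
        done += 1
    return done


server = ToyServer()
server.submit(["a", "b", "c"])
processed = run_client(server, str.upper)
print(processed, server.results)
```

The point of the sketch is the direction of the requests: the server never contacts a client, it only answers when a client asks for work or reports a result.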
CAS@Home
CAS@Home is a BOINC-based volunteer computing project hosted at IHEP-CC.
Goal: collect a large amount of free computing resources from volunteers to support scientific computing at CAS and other institutes.
Current resources: about 11,000 registered hosts, 8,000 registered computers; roughly 60% of them are active
Designed for multiple applications
Current application: Scthread
Applications being ported: BOSS, LAMMPS
Porting BOSS
Challenges:
BOSS is heavily platform dependent, but most volunteer computers are Windows machines
A big code base: several GB, which takes a long time to download
Solution:
A virtual machine is used to provide the BOSS runtime environment
[Diagram: the BOSS job management system feeds the BOINC server. Each volunteer computer (1 to N) runs a BOINC client and a CernVM that executes BOSS jobs. The BOINC client creates/starts/resumes the VM, migrates BOSS job files to the VM, executes the BOSS job from the host, queries job status, and pauses/stops the VM when the job is finished. Other components shown: SQUID and the CVMFS server.]
How it works
BOSS jobs are submitted to the BOINC server via the existing BOSS job management system.
BOINC clients running on desktops request and receive jobs from the BOINC server.
Each job:
Creates/starts/resumes a virtual machine from the CernVM image
Copies the BOSS job files to the virtual machine's shared folder and runs them on CernVM
Gets the job status (a file is created in the shared folder to indicate job status)
Finishes the job, then pauses/powers off/removes the VM
CernVM downloads and caches the BOSS software (this happens only once), then runs the BOSS job
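The job status step can be sketched as a host-side poll of the shared folder. The slides do not name the status file or its contents, so `job_status` and the values `finished` and `failed` are assumptions made only for this illustration.

```python
import tempfile
import time
from pathlib import Path

STATUS_FILE = "job_status"  # hypothetical file name; the slides do not name it


def read_status(shared_dir):
    """Return the status string written by the guest, or None if no file exists yet."""
    path = Path(shared_dir) / STATUS_FILE
    if not path.exists():
        return None
    return path.read_text().strip()


def wait_for_job(shared_dir, poll_seconds=1.0, timeout_seconds=60.0):
    """Poll the shared folder until the guest reports a terminal status or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = read_status(shared_dir)
        if status in ("finished", "failed"):  # assumed status values
            return status
        time.sleep(poll_seconds)
    return "timeout"


with tempfile.TemporaryDirectory() as shared:
    # Simulate the guest side writing its status into the shared folder.
    (Path(shared) / STATUS_FILE).write_text("finished\n")
    result = wait_for_job(shared, poll_seconds=0.01, timeout_seconds=1.0)
    print(result)
```

A file in the shared folder keeps the host and guest decoupled: the host needs no network connection into the VM, only read access to a directory both sides can see.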
Related Technologies
BOINC
CernVM
VirtualBox
BOINC
Client features:
Sticky files: files remain on the client after the job is done (e.g. VM images)
Report on RPC: the client reports the presence of files to the BOINC server
Locality scheduling: jobs are scheduled to where the files are located
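The idea behind locality scheduling can be shown with a toy selection function: given the files each client has reported, prefer the host that already holds the most files a job needs. This is only an illustration of the principle, not BOINC's actual scheduler logic; `pick_host` and the file names are hypothetical.

```python
def pick_host(required_files, host_files):
    """Return the host holding the most required files, or None if no host has any.

    host_files maps a host name to the set of file names that host reported.
    """
    best_host, best_count = None, 0
    for host in sorted(host_files):  # sorted so that ties break deterministically
        count = len(required_files & host_files[host])
        if count > best_count:
            best_host, best_count = host, count
    return best_host


# Hypothetical report: which sticky files each client says it still has.
reported = {
    "host-a": {"cernvm.vdi"},
    "host-b": {"cernvm.vdi", "boss_inputs.tar"},
    "host-c": set(),
}
chosen = pick_host({"cernvm.vdi", "boss_inputs.tar"}, reported)
print(chosen)
```

For large sticky files such as VM images, sending the job to a host that already has the image avoids repeating a download of several hundred megabytes.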
BOINC Wrapper: used to control the start/suspension/resumption/finish of the application and to report CPU time.
An application can instead be rewritten with the BOINC API to do this (the wrapper is not needed in that case).
For virtual machines, the wrapper creates/starts/pauses/resumes/powers off VMs through the hypervisor (VirtualBox).
The BOINC developers have finished a generic VM wrapper.
History of BOINC wrapper
A specific wrapper was developed for LHC@home; it is based on VirtualBox and CernVM. It uses CoPilot to schedule jobs, which differs from our case.
With CoPilot, the LHC@home wrapper does not have to support host/guest file sharing or guest control (i.e., executing commands on the guest machine from the host machine).
The LHC@home wrapper supports only one virtual machine instance per host machine.
The BOINC developers are working on a generic VM wrapper that is still VirtualBox based but supports guest control and multiple VM instances per host machine.
CernVM
CernVM is a thin virtual machine image dedicated to LHC experiment users. Its basic image is about 250 MB and runs on most hypervisors (VirtualBox, VMware, Xen, KVM, Hyper-V Server).
LHC and other HEP applications can easily be distributed to and run on different platforms (Linux/Windows/Mac) via a hypervisor plus a CernVM image.
The current version is 2.2. It provides three types of SLC5-based Linux system images: Desktop, Basic, and BOINC. Earlier versions support SLC4.
Software distribution and CVMFS
CernVM comes with a network file system, CVMFS, which delivers software to CernVM users.
CVMFS is mounted so that it appears as a local file system to CernVM users. Files are downloaded and cached on demand.
CVMFS works like a software repository. It currently supports four LHC experiments (ALICE, ATLAS, CMS, and LHCb) as well as other experiments and projects (LCD, NA61, and H1). BOSS has also been deployed.
CVMFS is HTTP based, so firewalls and proxies are not a concern.
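The "download and cache on demand" behavior can be sketched with a small read-through cache. This is a conceptual model only, not how CVMFS is implemented: `OnDemandCache`, `fake_fetch`, and the repository path are hypothetical, and the fetch function stands in for an HTTP request.

```python
class OnDemandCache:
    """Read-through cache: fetch a file the first time it is read, then reuse it."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable(path) -> bytes; stands in for an HTTP GET
        self._cache = {}
        self.downloads = 0    # number of times the fetch function was actually called

    def read(self, path):
        if path not in self._cache:
            self._cache[path] = self._fetch(path)
            self.downloads += 1
        return self._cache[path]


def fake_fetch(path):
    # Placeholder for a real download; returns dummy content for the sketch.
    return f"contents of {path}".encode()


repo = OnDemandCache(fake_fetch)
first = repo.read("/cvmfs/example.repo/bin/tool")   # hypothetical path, triggers a fetch
second = repo.read("/cvmfs/example.repo/bin/tool")  # served from the cache
print(repo.downloads)
```

Only the files a job actually touches are transferred, which is why a code base of several GB does not have to be downloaded in full before the first job runs.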
CoPilot
CernVM comes with a job scheduling system, CoPilot, which uses a pull model and is based on the XMPP protocol.
CoPilot Client: runs on any machine and submits jobs to the CoPilot Server
CoPilot Server: central service that schedules and delivers jobs and input files to CoPilot Agents and receives output files
CoPilot Agent: runs on each CernVM machine and executes jobs
VirtualBox
Hypervisor: free, open source, with versions for Linux/Windows/Mac
Rich command-line tools to create VMs and control their status:
Create VM / modify VM / start VM
Save VM state / pause VM
Resume VM
Power off VM / release VM / remove VM
List running VMs / existing VM instances
Life cycle: create -> start -> [pause | resume / save_state] -> poweroff -> [remove | release]
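The life cycle above can be modeled as a small state machine that also records the `VBoxManage` command line each step would correspond to. This is a sketch under stated assumptions: the state names and the exact transition table are an interpretation of the slide, the VM name is hypothetical, the "release" step is not mapped to any command, and nothing is executed (the code only builds argument lists).

```python
# Allowed (state, action) -> next state; an interpretation of the slide's life cycle.
TRANSITIONS = {
    ("new", "create"): "created",
    ("created", "start"): "running",
    ("running", "pause"): "paused",
    ("paused", "resume"): "running",
    ("running", "save_state"): "saved",
    ("saved", "start"): "running",
    ("running", "poweroff"): "off",
    ("paused", "poweroff"): "off",
    ("off", "start"): "running",
    ("off", "remove"): "removed",
}


def vboxmanage_args(action, name):
    """Build (but do not run) the VBoxManage argument list for one action."""
    commands = {
        "create": ["VBoxManage", "createvm", "--name", name, "--register"],
        "start": ["VBoxManage", "startvm", name, "--type", "headless"],
        "pause": ["VBoxManage", "controlvm", name, "pause"],
        "resume": ["VBoxManage", "controlvm", name, "resume"],
        "save_state": ["VBoxManage", "controlvm", name, "savestate"],
        "poweroff": ["VBoxManage", "controlvm", name, "poweroff"],
        "remove": ["VBoxManage", "unregistervm", name, "--delete"],
    }
    return commands[action]


class VmLifecycle:
    """Tracks one VM's state and rejects actions that are invalid in that state."""

    def __init__(self, name):
        self.name = name
        self.state = "new"
        self.issued = []  # command lines that would have been run, in order

    def apply(self, action):
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action} a VM in state {self.state}")
        self.issued.append(vboxmanage_args(action, self.name))
        self.state = TRANSITIONS[key]
        return self.state


vm = VmLifecycle("boss-vm-1")  # hypothetical VM name
for step in ["create", "start", "pause", "resume", "poweroff", "remove"]:
    vm.apply(step)
print(vm.state)
```

To actually drive VirtualBox, each recorded argument list could be passed to `subprocess.run`; keeping command construction separate from execution makes the transition rules testable without a hypervisor installed.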
Glue all pieces together
A BOINC VM wrapper is needed to establish communication between the VM and the client.
The VM wrapper uses VirtualBox's guestcontrol to execute commands on the guest machine from the host machine.
The VM wrapper receives signals from the client to decide whether to pause or resume VirtualBox.
The VM wrapper copies the BOSS-related files to a shared folder so that they appear in the virtual machine for execution.
With multiple VMs, the wrapper needs to keep track of the status of each VM.
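How a wrapper might track several VMs and react to client signals can be sketched as follows. This is not the BOINC VM wrapper: `VmTracker`, the signal names `suspend` and `resume`, and the status strings are hypothetical, and the hypervisor call is injected as a plain function so the sketch runs without VirtualBox.

```python
class VmTracker:
    """Keeps one status per VM and forwards client suspend/resume signals."""

    def __init__(self, control):
        # control(vm_name, action) performs the hypervisor call; injected so that
        # this sketch does not depend on VirtualBox being installed.
        self._control = control
        self.status = {}

    def register(self, vm_name):
        self.status[vm_name] = "running"

    def mark_finished(self, vm_name):
        self.status[vm_name] = "finished"

    def on_client_signal(self, signal):
        """Pause all running VMs on 'suspend'; resume all paused VMs on 'resume'."""
        if signal == "suspend":
            source, action, target = "running", "pause", "paused"
        elif signal == "resume":
            source, action, target = "paused", "resume", "running"
        else:
            raise ValueError(f"unknown signal: {signal}")
        changed = []
        for name, state in self.status.items():
            if state == source:
                self._control(name, action)
                self.status[name] = target
                changed.append(name)
        return changed


calls = []
tracker = VmTracker(lambda name, action: calls.append((name, action)))
tracker.register("vm-1")
tracker.register("vm-2")
tracker.mark_finished("vm-2")
tracker.on_client_signal("suspend")
print(tracker.status, calls)
```

Keeping the per-VM status in one place means a finished VM is never paused or resumed by mistake when the client signals a change for the whole host.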
Glue all pieces together
Only a basic CernVM image is downloaded by the client and placed in the project directory. To run multiple instances on a host, each instance needs its own clone of the image.
Input files include the image (CernVM, original size 800 MB), BOSS scripts, and their associated input files.
BOSS software is distributed via CVMFS; a local Squid is used to cache the repository.
Thanks!
For more information: http://twiki.ihep.ac.cn/twiki/bin/view/CASAtHome/BOSSonBOINC