Workload Management
David Colling, Imperial College London
• Release 2 is not based on release 1
• Whole new architecture (pretty much described in D1.4)
• More modular
• I have little practical experience of this new architecture (yet).
So what is the new architecture?
See D1.4 for details…
The architecture
User Interface: Although there have been several changes to the architecture, the commands available at the user end are (almost) the same, now named edg-job-submit etc. There are also now APIs.
Network Server: The Network Server is a generic network daemon, responsible for accepting incoming requests from the UI (e.g. job submission, job removal), which, if valid, are then passed to the Workload Manager.
The architecture
Workload Manager: The Workload Manager is the core component of the Workload Management System. Given a valid request, it has to take the appropriate actions to satisfy it.
To do so, it may need support from other components, which are specific to the different request types.
The architecture
Resource Broker: This has been turned into one of the modules that help the Workload Manager; actually 3 sub-modules…
• Matchmaking
• Ranking
• Scheduling
Job Adapter: The Job Adapter puts the finishing touches to the job's JDL and creates the job wrapper.
The architecture
Job Controller and CondorG: These actually submit the job to the resources and track its progress.
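The flow of a request through these components can be sketched roughly as follows. This is only an illustrative model, not WP 1 code: the `Request` class, the function names and the set of request kinds are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical stand-in for a request arriving from the UI."""
    kind: str                                   # e.g. "submit" or "remove"
    trace: list = field(default_factory=list)   # components visited, in order


def network_server(req: Request) -> bool:
    """Accept the request only if it is valid (here: a known kind)."""
    req.trace.append("NetworkServer")
    return req.kind in {"submit", "remove"}


def workload_manager(req: Request) -> None:
    """Call on the helper modules that this request type needs."""
    req.trace.append("WorkloadManager")
    if req.kind == "submit":
        req.trace.append("ResourceBroker")        # matchmaking, ranking, scheduling
        req.trace.append("JobAdapter")            # final JDL touches, job wrapper
        req.trace.append("JobController/CondorG")  # submits and tracks the job
    # Handling of other request types is deliberately not modelled here.


def handle(req: Request) -> list:
    """Pass a request to the Workload Manager only if the NS accepts it."""
    if network_server(req):
        workload_manager(req)
    return req.trace
```

Calling `handle(Request("submit"))` returns the list of components visited, while an invalid request stops at the Network Server.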
So how does this all work…
Job submission example (for a “simple” job)
[Diagram: components shown are the UI, Network Server, Workload Manager, Job Controller/CondorG, Replica Catalog, Information Service, Computing Element, Storage Element and the RB node; labelled flows: CE characteristics & status, SE characteristics & status.]
Job submission
edg-job-submit myjob.jdl
myjob.jdl:
  JobType = "Normal";
  Executable = "$(CMS)/exe/sum.exe";
  InputData = "LF:testbed0-00019";
  ReplicaCatalog = "ldap://sunlab2g.cnaf.infn.it:2010/rc=WP2 INFN Test Replica Catalog,dc=sunlab2g, dc=cnaf, dc=infn, dc=it";
  DataAccessProtocol = "gridftp";
  InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
  OutputSandbox = {"sim.err", "test.out", "sim.log"};
  Requirements = other.GlueHostOperatingSystemName == "linux" && other.GlueHostOperatingSystemRelease == "Red Hat 6.2" && other.GlueCEPolicyMaxWallClockTime > 10000;
  Rank = other.GlueCEStateFreeCPUs;
Job Status: submitted
UI: allows users to access the functionalities of the WMS
Job Description Language (JDL) is used to specify job characteristics and requirements
NS: network daemon responsible for accepting incoming requests
[Diagram labels: RB storage; Input Sandbox files; Job]
Job Status: submitted → waiting
Job submission
WM: responsible for taking the appropriate actions to satisfy the request
[Diagram labels: RB storage; Job]
Job Status: submitted → waiting
Job submission
Match-maker: where must this job be executed?
Job Status: submitted → waiting
Job submission
Match-maker/Broker: responsible for finding the "best" CE to which to submit a job
Job Status: submitted → waiting
Job submission
Match-maker/Broker: Where (on which SEs) is the needed data? What is the status of the Grid?
Job Status: submitted → waiting
Job submission
Match-maker: CE choice
Job Status: submitted → waiting
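As a rough illustration of the CE choice, the Requirements and Rank expressions from myjob.jdl can be applied to a handful of CE records: Requirements filters the candidates and the highest Rank wins. This is a simplified sketch, not the broker's algorithm; the CE names and attribute values are invented, and data location (which SEs hold the input data) is not modelled.

```python
# Invented CE records; only the Glue attribute names come from myjob.jdl.
CES = [
    {"id": "ce-a.example.org", "GlueHostOperatingSystemName": "linux",
     "GlueHostOperatingSystemRelease": "Red Hat 6.2",
     "GlueCEPolicyMaxWallClockTime": 20000, "GlueCEStateFreeCPUs": 4},
    {"id": "ce-b.example.org", "GlueHostOperatingSystemName": "linux",
     "GlueHostOperatingSystemRelease": "Red Hat 6.2",
     "GlueCEPolicyMaxWallClockTime": 20000, "GlueCEStateFreeCPUs": 12},
    {"id": "ce-c.example.org", "GlueHostOperatingSystemName": "linux",
     "GlueHostOperatingSystemRelease": "Red Hat 7.2",
     "GlueCEPolicyMaxWallClockTime": 20000, "GlueCEStateFreeCPUs": 50},
    {"id": "ce-d.example.org", "GlueHostOperatingSystemName": "linux",
     "GlueHostOperatingSystemRelease": "Red Hat 6.2",
     "GlueCEPolicyMaxWallClockTime": 5000, "GlueCEStateFreeCPUs": 100},
]


def requirements(ce: dict) -> bool:
    """Python rendering of the Requirements expression in myjob.jdl."""
    return (ce["GlueHostOperatingSystemName"] == "linux"
            and ce["GlueHostOperatingSystemRelease"] == "Red Hat 6.2"
            and ce["GlueCEPolicyMaxWallClockTime"] > 10000)


def rank(ce: dict) -> int:
    """Python rendering of the Rank expression: more free CPUs is better."""
    return ce["GlueCEStateFreeCPUs"]


def choose_ce(ces: list):
    """Return the matching CE with the highest rank, or None if none match."""
    candidates = [ce for ce in ces if requirements(ce)]
    if not candidates:
        return None
    return max(candidates, key=rank)
```

With these made-up records, `ce-c` and `ce-d` fail the requirements despite having more free CPUs, so the choice falls on `ce-b`.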
Job submission
JA: responsible for the final "touches" to the job before performing submission (e.g. creation of wrapper script, etc.)
Job Status: submitted → waiting
Job submission
JC: responsible for the actual job management operations (done via CondorG)
Job Status: submitted → waiting → ready
Job submission
[Diagram labels: RB storage; Job; Input Sandbox files]
Job Status: submitted → waiting → ready → scheduled
Job submission
[Diagram labels: RB storage; Input Sandbox; Job; "Grid enabled" data transfers/accesses]
Job Status: submitted → waiting → ready → scheduled → running
Job submission
[Diagram labels: RB storage; Output Sandbox files]
Job Status: submitted → waiting → ready → scheduled → running → done
Job submission
[Diagram labels: RB storage; Output Sandbox]
Job Status: submitted → waiting → ready → scheduled → running → done
edg-job-get-output <dg-job-id>
Job submission
[Diagram labels: RB storage; Output Sandbox files]
Job Status: submitted → waiting → ready → scheduled → running → done → cleared
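The status sequence seen over the preceding slides can be written down as a small state machine. This is a sketch covering only the "simple" job path shown here; the `TRANSITIONS` table and `advance` helper are illustrative names, and the real system has more transitions (for example after a failure).

```python
# Allowed next states for the simple-job path shown on the slides.
TRANSITIONS = {
    "submitted": {"waiting"},
    "waiting": {"ready"},
    "ready": {"scheduled"},
    "scheduled": {"running"},
    "running": {"done"},
    "done": {"cleared"},
    "cleared": set(),
}


def advance(current: str, new: str) -> str:
    """Return the new status, or raise if the transition is not allowed."""
    if new not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new


def walk(path: list) -> str:
    """Follow a list of statuses from the first to the last, checking each step."""
    status = path[0]
    for nxt in path[1:]:
        status = advance(status, nxt)
    return status
```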
Job submission
Logging and bookkeeping
[Diagram: components shown are the UI, Network Server, Workload Manager, Job Controller/CondorG, Log Monitor, Logging & Bookkeeping, Computing Element and the RB node; labelled flows: log of job events, job status.]
LM: parses the CondorG log file (where CondorG logs information about jobs) and notifies LB
LB: receives and stores job events; processes the corresponding job status
edg-job-status <dg-job-id>
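The division of labour between LM and LB might be pictured like this. It is a toy model only: the two-field log line format is invented for the example (the real CondorG log looks different), and the `Bookkeeping` class and `log_monitor` function are hypothetical names.

```python
from collections import defaultdict


class Bookkeeping:
    """Toy LB: stores job events and derives the current status from them."""

    def __init__(self):
        self._events = defaultdict(list)

    def log_event(self, job_id: str, status: str) -> None:
        """Store one event for a job."""
        self._events[job_id].append(status)

    def status(self, job_id: str) -> str:
        """Current status is taken to be the most recent event, if any."""
        events = self._events.get(job_id)
        return events[-1] if events else "unknown"


def log_monitor(log_lines: list, lb: Bookkeeping) -> None:
    """Toy LM: parse lines of an invented 'job_id status' format and notify LB."""
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip anything that does not look like an event
        job_id, status = parts
        lb.log_event(job_id, status)
```

A status query such as edg-job-status would then correspond, in this toy model, to a call like `lb.status(job_id)`.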
New functionality…
Release 2 of WP 1 software
New functionality includes:
• MPI job submission
• User APIs
• Accounting infrastructure (management have decided not to deploy this for testbed 2)
• Interactive job support
• Job logical checkpointing
New functionality…
All these are implemented…
Specify the type of job using the JobType ClassAd attribute, e.g.
JobType = "Checkpointable"
However, these have only been tested on the WP 1 testbed so far…
There is not time to go through all of these, so I will just go through checkpointing.
Job checkpointing scenario
[Diagram: components shown are the UI, Network Server, Workload Manager, Job Controller/CondorG, the RB node, Computing Element X, Computing Element Y and the Logging & Bookkeeping Server.]
UI: allows users to access the functionalities of the WMS
Job Description Language (JDL) is used to specify job characteristics and requirements
edg-job-submit jobchkpt.jdl
jobchkpt.jdl:
  [
  JobType = "Checkpointable";
  Executable = "hsum.exe";
  StdOutput = "Outfile";
  InputSandbox = "/home/user/hsum.exe";
  OutputSandbox = "Outfile";
  Requirements = member("ROOT", other.GlueHostApplicationSoftwareRunTimeEnvironment) && member("CHKPT", other.GlueHostApplicationSoftwareRunTimeEnvironment);
  Rank = -other.GlueCEStateEstimatedResponseTime;
  ]
Job Status: submitted
[Diagram labels: RB storage; Job; Input Sandbox files; Match-maker; Job Adapter; numbered steps 1 to 6]
Job Status: submitted → waiting → ready → scheduled → running
From time to time the user's job asks to save its intermediate state:
  …
  <save intermediate files>;
  State.saveValue("var1", value1);
  …
  State.saveValue("varn", valuen);
  State.saveState();
  …
Job Status: submitted → waiting → ready → scheduled → running
[Diagram labels: RB storage; saving of job state; saving of intermediate files]
Job Status: submitted → waiting → ready → scheduled → running
Job fails (e.g. because of a CE problem)
Job Status: submitted → waiting → ready → scheduled → running → done (failed)
Match-maker: reschedule and resubmit job
Where must this job be executed? Possibly on a different CE from the one to which the job was previously submitted…
Job Status: submitted → waiting → ready → scheduled → running → done (failed) → waiting
Match-maker: CE choice: CEy
Job Status: submitted → waiting → ready → scheduled → running → done (failed) → waiting
[Diagram labels: RB storage; Job Adapter; Job; CE characteristics & status]
Job Status: ready → scheduled → running → done (failed) → waiting → ready
[Diagram labels: RB storage; Job; Input Sandbox files]
Job Status: ready → scheduled → running → done (failed) → waiting → ready → scheduled
Retrieval of last saved state when job starts
Retrieval of intermediate files (previously saved)
Job Status: scheduled → running → done (failed) → waiting → ready → scheduled → running
Job keeps running, starting from the point corresponding to the retrieved state (it doesn't need to start from the beginning)
Job Status: scheduled → running → done (failed) → waiting → ready → scheduled → running
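The whole scenario (save state periodically, fail, resubmit, resume from the last saved state) can be imitated in a few lines. This is a minimal sketch under stated assumptions: the `State` class below is a hypothetical Python analogue of the State object on the slides, it saves to a local JSON file rather than to any grid service, and the failure is simulated.

```python
import json
import os
import tempfile


class State:
    """Hypothetical analogue of the slides' State object, backed by a JSON file."""

    def __init__(self, path: str):
        self.path = path
        self.values = {}
        if os.path.exists(path):          # retrieve the last saved state, if any
            with open(path) as f:
                self.values = json.load(f)

    def save_value(self, name: str, value) -> None:
        self.values[name] = value

    def get_value(self, name: str, default):
        return self.values.get(name, default)

    def save_state(self) -> None:
        with open(self.path, "w") as f:
            json.dump(self.values, f)


def run_job(path: str, n_steps: int, fail_at=None) -> int:
    """Sum 0..n_steps-1, checkpointing after every step."""
    state = State(path)
    step = state.get_value("step", 0)     # resume point (0 on a fresh start)
    total = state.get_value("total", 0)
    while step < n_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated CE failure")
        total += step
        step += 1
        state.save_value("step", step)
        state.save_value("total", total)
        state.save_state()
    return total


def demo() -> int:
    """First run fails part-way; the second run resumes and finishes."""
    path = os.path.join(tempfile.mkdtemp(), "state.json")
    try:
        run_job(path, 10, fail_at=6)
    except RuntimeError:
        pass                              # job "fails" on the first CE
    return run_job(path, 10)              # resubmitted job resumes from step 6
```

The second call to `run_job` does not repeat steps 0 to 5, because it starts from the values retrieved from the saved state.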
Further additional functionality
The order of implementation is not up to WP 1 people…
Dependent jobs: using Condor DAGMan
For example…
A = [
  Executable = "A.sh";
  PreScript = "PreA.sh";
  PreScriptArguments = { "1" };
  Children = { "B", "C" }
];
B = [
  Executable = "B.sh";
  PostScript = "PostA.sh";
  PostScriptArguments = { "$RETURN" };
  Children = { "D" }
];
C = [
  Executable = "C.sh";
  Children = { "D" }
];
D = [
  Executable = "D.sh";
  PreScript = "PreD.sh";
  PostScript = "PostD.sh";
  PostScriptArguments = { "1", "a" }
]
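The Children attributes above define a dependency graph in which A must finish before B and C, and both must finish before D. One way to see a valid execution order is a topological sort, sketched here with Python's standard `graphlib` module (Python 3.9+). This only illustrates the ordering constraint; it is not how DAGMan itself is implemented, and the `execution_order` helper is a name made up for the example.

```python
from graphlib import TopologicalSorter

# Children sets copied from the example JDL above.
CHILDREN = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def execution_order(children: dict) -> list:
    """Return one order in which every job runs after all of its parents."""
    # TopologicalSorter expects a mapping of node -> predecessors,
    # so invert the Children relation into a parents relation first.
    parents = {node: set() for node in children}
    for node, kids in children.items():
        for kid in kids:
            parents[kid].add(node)
    return list(TopologicalSorter(parents).static_order())
```

For this graph, A always comes first and D last; B and C may appear in either order, since neither depends on the other.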
Further additional functionality
Job partitioning will be similar to checkpointing, with the jobs being partitioned according to some variable.
Partitioned jobs will also have a pre-job and aggregator
e.g.
Further additional functionality
JobType = Partitionable;
Executable = ...;
JobSteps = ...;
StepWeight = ...;
Requirements = ...;
... ...
Prejob = [
  Executable = ...
  Requirements = ...;
  ... ...
Aggregator = [
  Executable = ...
  Requirements = ...;
  ... ...
];
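The general shape of a partitioned job (split the steps, run the parts independently, then aggregate) might look like the following. This is a sketch of the idea only: the slides do not spell out the semantics of JobSteps or StepWeight, so the equal-sized split, the sum-of-squares workload and all function names here are assumptions.

```python
def partition(steps: list, n_parts: int) -> list:
    """Split steps into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(steps), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(steps[start:end])
        start = end
    return chunks


def sub_job(chunk: list) -> int:
    """Stand-in for the work one partition does (here: a sum of squares)."""
    return sum(x * x for x in chunk)


def aggregator(partials: list) -> int:
    """Combine the partial results of all sub-jobs."""
    return sum(partials)


def run_partitioned(steps: list, n_parts: int) -> int:
    # A pre-job would run here, before the partitions start; not modelled.
    partials = [sub_job(chunk) for chunk in partition(steps, n_parts)]
    return aggregator(partials)
```

On a grid each sub-job would be scheduled on its own resource; here they simply run one after another.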
Further additional functionality
Also planned are advance reservation of resources and co-location.
Much more monitoring and performance quantification…
Summary
• New architecture has been implemented
• Lots of new functionality … but not stress tested
• Further functionality and performance quantification to be implemented by testbed 3.
Further into the future…
EDG will not use OGSA; however, the future is in the OGSA grid world.
Work is being done at LeSC (See Steven Newhouse’s talk tomorrow) to wrap the WP 1 components.
Communication via JDML and LBML
Virtualisation of RB through OGSA factory
Use virtualisation to load balance
Increase interoperability