ORNL is managed by UT-Battelle for the US Department of Energy Process Management Adam Simpson OLCF User Support

ORNL is managed by UT-Battelle for the US Department of Energy

Process Management

Adam Simpson, OLCF User Support

2 Intellectual Property Training August 2014

Process Management

• Daemon process
  – Perpetually running program
  – Runs even when the user is not logged in

• Scheduled process
  – Process that runs at scheduled time intervals
  – Run process every x hours
  – Run process at xx:xx every day


Cron

• Don't use cron at the OLCF
  – Jobs are subject to deletion without notice
  – Cron utility is subject to being disabled
  – Crontab is not persistent across OS upgrades
  – Not robust with respect to system downtimes
  – Only available on a small set of resources


Daemons: Chained Jobs

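• Recursively submitted PBS jobs
  – The (n+1)th job depends on completion of the nth job

• Recommended option for OLCF resources

• Size-0 job
  – Does not count against allocation
  – Runs on a service node

• Create the dependency with a qsub flag

  -W depend=afterok:{jobid}      (start only if {jobid} finishes without error)
  -W depend=afternotok:{jobid}   (start only if {jobid} finishes with an error)
  -W depend=afterany:{jobid}     (start once {jobid} finishes, regardless of exit status)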
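When jobs are submitted from a script, the dependency flag above can be assembled programmatically. The following is a minimal sketch rather than an OLCF-provided tool: the helper name `dependency_command` and its parameters are hypothetical, and it only builds the argument list without calling qsub.

```python
# Hypothetical helper: builds the qsub argument list for a chained job.
# Only the "-W depend=<type>:<jobid>" syntax comes from the slide above.
VALID_TYPES = ("afterok", "afternotok", "afterany")


def dependency_command(jobid, script, dep_type="afterok"):
    """Return the qsub command (as a list) that makes `script` wait on `jobid`."""
    if dep_type not in VALID_TYPES:
        raise ValueError("unsupported dependency type: %s" % dep_type)
    return ["qsub", "-W", "depend=%s:%s" % (dep_type, jobid), script]


# Show the command line that would be run for an illustrative job id.
print(" ".join(dependency_command("12345", "launcher.pbs")))
```

On a system where qsub is available, the returned list could be passed to `subprocess.check_call`; keeping the construction separate makes it easy to inspect the command first.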


Daemons: daemon.py

import time
from datetime import datetime

# Print a timestamp every 30 seconds until the job is killed.
while True:
    print(str(datetime.now()))
    time.sleep(30)


Daemons: launcher.pbs

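#PBS -l walltime=24:00:00
#PBS -l nodes=0
#PBS -A PRJ123

# Queue the next link in the chain; it stays held until this job completes successfully.
qsub -W depend=afterok:$PBS_JOBID launcher.pbs

# Run the daemon unbuffered, appending stdout and stderr to a log file.
python -u $HOME/daemon.py >> $HOME/daemon.out 2>&1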


Daemons: running

• Start

$ qsub launcher.pbs

• Monitor

$ showq -u $USER
active jobs------------------------
{jobid} Running

blocked jobs-----------------------
{jobid#2} Hold

• Stop

$ qdel {jobid} {jobid#2}


Scheduled Jobs

• Instead of cron, use qsub with a start time

qsub -a [[[[CC]YY]MM]DD]hhmm[.SS]

• [CC] – first two digits of the year (century)
• [YY] – last two digits of the year
• [MM] – month
• [DD] – day of the month
• hhmm – hour and minute (required)
• [.SS] – seconds
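To make the timestamp format concrete, the sketch below renders a Python `datetime` in the fully specified form `CCYYMMDDhhmm[.SS]`. It is an illustration only: the function name `qsub_start_time` and its `with_seconds` parameter are hypothetical, and the example date is arbitrary.

```python
from datetime import datetime


def qsub_start_time(when, with_seconds=False):
    """Format `when` as CCYYMMDDhhmm, optionally followed by .SS, for qsub -a."""
    stamp = when.strftime("%Y%m%d%H%M")  # %Y covers both CC and YY
    if with_seconds:
        stamp += when.strftime(".%S")
    return stamp


# Arbitrary example: 15 August 2014 at 10:30:05.
example = datetime(2014, 8, 15, 10, 30, 5)
print(qsub_start_time(example))                     # 201408151030
print(qsub_start_time(example, with_seconds=True))  # 201408151030.05
```

Spelling out the full date removes any ambiguity about which day a bare `hhmm` refers to, at the cost of a longer argument.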


Scheduled Jobs: sched.py

from datetime import datetime

# Print the current timestamp once, then exit.
print(str(datetime.now()))


Scheduled Jobs: launcher.pbs (run at 10:30 every day)

#PBS -l walltime=00:05:00
#PBS -l nodes=0
#PBS -A PRJ123

# Resubmit this script so that it next becomes eligible to run at 10:30.
qsub -a 1030 launcher.pbs

python $HOME/sched.py >> $HOME/sched.out 2>&1


Scheduled Jobs: launcher.pbs (run every 12 hours)

#PBS -l walltime=00:05:00
#PBS -l nodes=0
#PBS -A PRJ123

# Compute the start time 12 hours from now in hhmm form (GNU date syntax).
DT=$(date -d "+12 hours" +%H%M)
qsub -a $DT launcher.pbs

python $HOME/sched.py >> $HOME/sched.out 2>&1
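The `date -d` form used in the launcher is GNU-specific. Where that is a concern, the same `hhmm` value could be computed in Python instead. This is a minimal sketch, and the function name `next_start_hhmm` is hypothetical.

```python
from datetime import datetime, timedelta


def next_start_hhmm(now, hours=12):
    """Return the time `hours` from `now` as an hhmm string for qsub -a."""
    return (now + timedelta(hours=hours)).strftime("%H%M")


# Arbitrary example: 22:45 plus 12 hours wraps past midnight to 10:45.
print(next_start_hhmm(datetime(2014, 8, 15, 22, 45)))  # 1045
```

Because only `hhmm` is produced, the choice of day is left to qsub, just as in the shell version.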