Small Tutorial on HTCondor

Claudia Solís-Lemus¹

March 23, 2015

¹ Based on slides by Steve Hunter and Lauren Michael

HTCondor: in its simplest form

CSL HTCondor March 23, 2015 2 / 33

When to use HTCondor?

Many independent jobs to run in parallel: high-throughput computing (HTC):

- your computing work won't run at all on your computer(s) (lack sufficient RAM, disk, ...)
- your computing work will take too long on your own computer(s)
- simply would like to off-load certain processes in favor of running others on your computer(s)

Ideal running time for a single job: between 5 minutes and 2 hours, but the CHTC pool allows up to 72 h (even longer jobs possible)

Keep in mind: high-performance computing (HPC) is also available in CHTC: http://chtc.cs.wisc.edu/HPCuseguide.shtml

Real-life example in HTCondor: phylogenetics

- Run 1800 quartets, each with 4000 genes
- Each gene: 2 minutes
- Each quartet: 5.5 days
- All quartets: 30 years
- With HTCondor: 9 days
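The figures in this example can be checked with a few lines of arithmetic. The sketch below assumes the genes and quartets would otherwise run strictly one after another on a single machine; the constant and function names are made up for illustration. It gives about 5.6 days per quartet and roughly 27 years in total, which the slide rounds to 30 years.

```python
# Figures quoted on the slide.
N_QUARTETS = 1800
GENES_PER_QUARTET = 4000
MINUTES_PER_GENE = 2
HTCONDOR_DAYS = 9

MINUTES_PER_DAY = 24 * 60


def serial_days_per_quartet():
    """Days needed to run all genes of one quartet back to back."""
    return GENES_PER_QUARTET * MINUTES_PER_GENE / MINUTES_PER_DAY


def serial_years_total():
    """Years needed to run every quartet back to back."""
    return N_QUARTETS * serial_days_per_quartet() / 365


def speedup():
    """Ratio of serial wall-clock time to the 9 days reported with HTCondor."""
    return N_QUARTETS * serial_days_per_quartet() / HTCONDOR_DAYS


if __name__ == "__main__":
    print(f"per quartet: {serial_days_per_quartet():.1f} days")
    print(f"all quartets: {serial_years_total():.1f} years")
    print(f"speedup with HTCondor: about {speedup():.0f}x")
```

The speedup comes from running many independent gene analyses at the same time, which is the high-throughput pattern described earlier.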

HTCondor: in its simplest form

[Diagram: HTCondor acts as the matchmaker between a Job and a Computer]

How does HTCondor know what you want/need?

Submit file

Simple submit file
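The submit file shown on this slide is not preserved in the transcript. As a stand-in, here is a minimal sketch of what a simple submit file commonly looks like, generated from Python so it can be checked. The executable name, the output file names, and the helper name `simple_submit_file` are hypothetical; the keywords (`universe`, `executable`, `arguments`, `output`, `error`, `log`, `queue`) are standard HTCondor submit commands.

```python
def simple_submit_file(executable, n_jobs, log="job.log"):
    """Return the text of a minimal HTCondor submit file (illustrative only)."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        # $(Process) expands to 0, 1, ..., n_jobs - 1, one value per queued job.
        "arguments = $(Process)",
        "output = job_$(Process).out",
        "error = job_$(Process).err",
        f"log = {log}",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # run_job.sh is a hypothetical wrapper script.
    print(simple_submit_file("run_job.sh", 3), end="")
```

Writing the returned text to a file gives something that can be passed to condor_submit on a submit node.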

Not so simple submit file

[Slide annotations: kb, Mb]
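The annotations most likely point at the units of the resource requests, although the slide itself is not preserved here, so that reading is an assumption. In HTCondor submit files a bare number for `request_disk` is read in KiB and a bare number for `request_memory` in MiB. The sketch below converts from GiB to those units; the helper name `resource_lines` is hypothetical.

```python
def resource_lines(memory_gb, disk_gb):
    """Format resource requests using HTCondor's default units.

    request_memory takes MiB and request_disk takes KiB when no unit
    suffix is given.
    """
    memory_mb = int(memory_gb * 1024)
    disk_kb = int(disk_gb * 1024 * 1024)
    return [
        f"request_memory = {memory_mb}",
        f"request_disk = {disk_kb}",
    ]


if __name__ == "__main__":
    # Ask for 2 GiB of memory and 1 GiB of disk.
    for line in resource_lines(2, 1):
        print(line)
```

Mixing up the two units is an easy mistake: a disk request written as if it were in MB asks for about a thousand times less space than intended.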

Another submit file

Submit to HTCondor

Check the queue in HTCondor

Submit to HTCondor: commands

condor_submit submitfile
condor_q username
condor_rm jobID or username
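For scripted use, the three commands can be wrapped in small helpers. This is only a sketch: the function names are made up, and `run` returns None instead of failing when the HTCondor command-line tools are not installed on the machine.

```python
import shutil
import subprocess


def submit_cmd(submit_file):
    """Command line that submits the jobs described in submit_file."""
    return ["condor_submit", submit_file]


def queue_cmd(username):
    """Command line that lists the queued jobs of one user."""
    return ["condor_q", username]


def remove_cmd(job_id_or_username):
    """Command line that removes one job, or all jobs of one user."""
    return ["condor_rm", str(job_id_or_username)]


def run(cmd):
    """Run a command and return its stdout, or None if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # job.sub is a hypothetical submit file name.
    print(" ".join(submit_cmd("job.sub")))
    print(run(queue_cmd("username")))
```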

Submit to HTCondor

[Diagram: submit node, HTCondor, and execute node desk22 (where job runs)]

CHTC pools

simon → Mike Camilleri [email protected]

CHTC → chtc.cs.wisc.edu, click "Get Started"

Checklist

✓ Input files
✓ My code
✓ Submit file
✓ condor_submit submitfile
✓ Monitor/remove jobs

Is it that simple?

Code in:

- C++
- Java
- Matlab
- R

chtc.cs.wisc.edu

Steps to run R code in HTCondor

1. Create R script
2. Take your R version and libraries with you
3. Download the ChtcRun package and follow the instructions
4. Run mkdag and submit jobs

Step 1: R script

Create your R script

Debug and test thoroughly locally

Organize your input files

Keep in mind what output files you are expecting

Warning: there is no folder structure on the execute node

Step 2: R version, libraries

- R version (list of options on the CHTC website)
- List of needed R packages (order important): copy source .tar.gz files
- You need to be on a CHTC (or simon) submit node

Step 3: ChtcRun package

Organize your input files as:

ChtcRun/
  Rin/
    0/       infile.txt, <specific files>
    1/       infile.txt, <specific files>
    job2/    infile.txt, <specific files>
    shared/  RLIBS.tar.gz, yourScript.r, <shared files>
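Setting up this layout by hand gets tedious with many jobs. The sketch below creates the directory skeleton with one folder per job plus `shared/`, assuming (as the listing suggests) that the job folders and `shared/` all sit inside `Rin/`. The function name `make_layout` is hypothetical, and the input files are created empty as placeholders; the real `RLIBS.tar.gz` and R script still need to be copied into `shared/`.

```python
import tempfile
from pathlib import Path


def make_layout(root, job_names):
    """Create ChtcRun/Rin/<job>/infile.txt for each job, plus Rin/shared/."""
    rin = Path(root) / "ChtcRun" / "Rin"
    for name in job_names:
        job_dir = rin / name
        job_dir.mkdir(parents=True, exist_ok=True)
        # Placeholder for the job-specific input file.
        (job_dir / "infile.txt").touch()
    # Files used by every job (R libraries, the R script) go here.
    (rin / "shared").mkdir(parents=True, exist_ok=True)
    return rin


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rin = make_layout(tmp, ["0", "1", "job2"])
        for path in sorted(rin.rglob("*")):
            print(path.relative_to(tmp))
```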

Modify process.template with:

request_memory, request_disk (run a small test first)

wantFlocking? wantGlidein?

Step 4: Run mkdag

Run ./mkdag with options

See ./mkdag --help for examples

./mkdag --data=Rin --outputdir=Rout --cmdtorun=soartest.R --pattern=meanx --type=R --version=R-2.10.1
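When the same call is repeated for several data sets, it can help to assemble the command line programmatically. The sketch below uses only the options that appear in the example above; it makes no claim about other mkdag options, and the function name `mkdag_command` is hypothetical.

```python
import shlex


def mkdag_command(data, outputdir, cmdtorun, pattern, job_type, version):
    """Build the mkdag argument list from the options shown on the slide."""
    return [
        "./mkdag",
        f"--data={data}",
        f"--outputdir={outputdir}",
        f"--cmdtorun={cmdtorun}",
        f"--pattern={pattern}",
        f"--type={job_type}",
        f"--version={version}",
    ]


if __name__ == "__main__":
    cmd = mkdag_command("Rin", "Rout", "soartest.R", "meanx", "R", "R-2.10.1")
    # shlex.join quotes arguments where needed, so the line is safe to paste.
    print(shlex.join(cmd))
```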

Message: All done!

Step 4: Submit your jobs

mkdag will instruct you:

cd Rout
condor_submit_dag mydag.dag

Don't forget to monitor your jobs: condor_q and check log files for memory/disk requirements

Important things to remember!

- Test your code locally
- There is no folder structure on the execute node
- Test in HTCondor gradually: 3 jobs, 100 jobs, 1000 jobs, ...
- Adjust memory/disk requirements
- Avoid transferring large files through the submit node (> 10 GB per batch, ∼ 10 MB per job, 1000 jobs)
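The transfer guideline is simple arithmetic: about 10 MB per job times 1000 jobs is about 10 GB per batch. The sketch below makes that check explicit, assuming decimal units (1 GB = 1000 MB) and reading the 10 GB figure as a per-batch ceiling; the function names are made up for illustration.

```python
def batch_transfer_gb(n_jobs, mb_per_job):
    """Total data moved through the submit node for one batch, in GB."""
    return n_jobs * mb_per_job / 1000


def max_jobs_within_budget(mb_per_job, budget_gb=10):
    """Largest batch size that stays within the per-batch transfer budget."""
    return int(budget_gb * 1000 // mb_per_job)


if __name__ == "__main__":
    print(batch_transfer_gb(1000, 10))
    print(max_jobs_within_budget(50))
```

If each job needs more than about 10 MB of input, either the batch has to shrink or the data has to reach the execute nodes some other way than through the submit node.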

Workflow - HTCondor DAGman

http://research.cs.wisc.edu/htcondor/dagman/dagman.html
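DAGMan runs jobs in dependency order, described in a plain-text DAG file that is passed to condor_submit_dag. As a minimal sketch, the code below writes the two basic DAG statements, `JOB <name> <submit file>` and `PARENT <parents> CHILD <children>`, for a small diamond-shaped workflow. The job names, submit file names, and the helper `dag_text` are hypothetical.

```python
def dag_text(submit_files, edges):
    """Return DAG file text.

    submit_files maps job name -> submit file.
    edges maps parent job name -> list of child job names.
    """
    lines = [f"JOB {name} {sub}" for name, sub in submit_files.items()]
    for parent, children in edges.items():
        lines.append(f"PARENT {parent} CHILD {' '.join(children)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Diamond workflow: A runs first, then B and C, then D once both finish.
    submit_files = {"A": "a.sub", "B": "b.sub", "C": "c.sub", "D": "d.sub"}
    edges = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
    print(dag_text(submit_files, edges), end="")
```

Jobs with no dependency between them, such as B and C here, can run at the same time, so a DAG still benefits from the pool's throughput.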

Help resources

chtc.cs.wisc.edu

[email protected]

research.cs.wisc.edu/htcondor/tutorials/

CHTC office hours: Wednesday 9:30-11:30am
