
Re: [HTCondor-users] equivalent job array option for condor_submit



On 8/4/2016 3:28 PM, Upton, Stephen (Steve) (CIV) wrote:
Hi All,

Is there an equivalent to job array like in SLURM or PBS where you can
submit specific job numbers? I’m aware of queue, but that seems to queue
up a specified number of jobs, starting from 0. I’m looking for the
ability to specify a set of job numbers or ranges, that kind of thing. I
didn’t find anything googling nor in the docs (but could be looking in
the wrong place or using a different terminology).



I assume you want to specify job number ranges because you ultimately want to pass the job number to your program as a command-line parameter, an environment variable, or some such?

The answer is that in HTCondor v8.4 and above, the "queue" command in condor_submit can take a lot more than just a number of jobs. It can queue one job per item in a list, per line of output from a script, per line in a file, per file found in a directory, etc. See the condor_submit manual page at https://is.gd/VlJDtI (man condor_submit). A talk on this topic presented at HTCondor Week may also be of interest: https://is.gd/1h9L67

Here are a couple of examples.

Let's say you want to run /bin/sleep, and the command-line argument to sleep is an item from a specific list (similar to a SLURM job array ID). Here it is:

  executable = /bin/sleep
  arguments = $(Item)
  queue in (0,5,6,9,11,26,99)

The above will result in 7 jobs being submitted: sleep 0, sleep 5, sleep 6, sleep 9, sleep 11, sleep 26, and sleep 99.

Another example, which leverages /usr/bin/seq (it should be available on just about every Linux box):

  executable = /bin/sleep
  arguments = $(Item)
  # the "seq" program wants MIN STEP MAX
  queue from seq 0 5 50 |

The above example runs seq, which writes the numbers 0, 5, 10, 15, ..., 50 to stdout, and condor_submit submits one job for each line of output. So the end result will be 11 jobs, with the argument to sleep being 0, 5, 10, 15, etc. Behold, after submitting the above:

$ condor_q -af clusterid procid cmd args
66526376 0 /bin/sleep 0
66526376 1 /bin/sleep 5
66526376 2 /bin/sleep 10
66526376 3 /bin/sleep 15
66526376 4 /bin/sleep 20
66526376 5 /bin/sleep 25
66526376 6 /bin/sleep 30
66526376 7 /bin/sleep 35
66526376 8 /bin/sleep 40
66526376 9 /bin/sleep 45
66526376 10 /bin/sleep 50
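
Since the original question was about sets and ranges of job numbers, the same "queue from <script> |" form could be pointed at a small script that expands a range spec into one number per line. What follows is only a sketch, not anything that ships with HTCondor: the script name expand_ranges.py, the function name, and the spec syntax (borrowed from SLURM's "0-4,7,10-20:5" style, without the "%" concurrency limit) are all assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Expand a SLURM-style spec such as "0-4,7,10-20:5" into one number per line.

Hypothetical helper for illustration; it is not part of HTCondor.
"""
import sys


def expand_ranges(spec):
    """Return the integers described by a spec like "0-4,7,10-20:5"."""
    items = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        # An optional ":step" suffix applies to a range, e.g. "10-20:5".
        body, _, step_text = part.partition(":")
        step = int(step_text) if step_text else 1
        if step < 1:
            raise ValueError("step must be a positive integer: %r" % part)
        if "-" in body:
            first, last = body.split("-", 1)
            # The upper bound is inclusive, as in SLURM job arrays.
            items.extend(range(int(first), int(last) + 1, step))
        else:
            items.append(int(body))
    return items


if __name__ == "__main__":
    # With no argument, fall back to a sample spec so the script still runs.
    spec = sys.argv[1] if len(sys.argv) > 1 else "0-4,7,10-20:5"
    for number in expand_ranges(spec):
        print(number)
```

Assuming the script is executable and sits next to the submit file, the last line of the seq example would then become something like "queue from ./expand_ranges.py 0-4,7,10-20:5 |", and each printed line would end up in $(Item) just as with seq.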

Hope the above helps,
Todd