Re: [Condor-users] Partial re-submit



On 8/12/05, Steffen Grunewald <steffen.grunewald@xxxxxxxxxx> wrote:
> Hi,
> 
> a coworker of mine just asked me how it would be possible to re-submit
> part of a job cluster - apparently he found some issues with selected
> input files and would like to re-run the associated jobs.
> I suggested to re-build the submit file in such a way that there would be
> "gaps" with dummy Executables (say, sleep 30), the Queue count set to
> the first number to be repeated (thus creating job 0..n-1), then set
> Executable to its real value and Queue 1, and so on. Unfortunately, the
> current condor_submit man page doesn't allow this.
> Another way would be to set up a translation table which would require
> a wrapper script. Since this looks like a horrible kludge, and would be
> no longer self-consistent: is there an easier way to do this? Are there
> plans to allow specification of ranges and lists within the Queue
> command? (e.g. 0-211,245-999)

Workaround:

When submitting, add a user-defined attribute to each job,

like

+KillMe = "True"

or

+KillMe = "False"

for the jobs you wish to skip or run, respectively.

Submit the cluster on hold ("hold = true" in the submit file),

then run the following command:

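condor_rm -constraint 'KillMe == "True"'

(the outer single quotes keep the shell from stripping the inner double
quotes, so KillMe is compared against the string "True")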

and release the remaining jobs with condor_release.
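
If writing the submit file by hand is tedious, something like the
following could generate it. This is only a rough sketch: the executable
name my_program, the "arguments = $(Process)" line and the helper names
parse_ranges/build_submit are made up for illustration, and the range
syntax is the one from the question (0-211,245-999 = jobs to run).

```python
def parse_ranges(spec):
    """Parse a spec such as "0-211,245-999" into a set of process numbers."""
    wanted = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A bare number like "7" is treated as the range 7-7.
        wanted.update(range(int(lo), int(hi or lo) + 1))
    return wanted


def build_submit(executable, total, run_spec):
    """Return submit-file text for `total` jobs, submitted on hold.

    Each run of consecutive process numbers gets its own queue statement,
    preceded by +KillMe = "False" (keep) or +KillMe = "True" (remove).
    """
    wanted = parse_ranges(run_spec)
    lines = [
        f"executable = {executable}",   # hypothetical executable name
        "arguments = $(Process)",       # assumption: job picks input by number
        "hold = true",
    ]
    start = 0
    while start < total:
        keep = start in wanted
        end = start
        # Extend the block while the keep/skip status stays the same.
        while end < total and (end in wanted) == keep:
            end += 1
        flag = "False" if keep else "True"
        lines += ["", f'+KillMe = "{flag}"', f"queue {end - start}"]
        start = end
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_submit("my_program", 1000, "0-211,245-999"), end="")
```

This relies on process numbers continuing across successive queue
statements within one submit file, so the skipped jobs still occupy
their slots until they are removed.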

A less hacky alternative is to redefine the requirements of the jobs
that should not run, but then the NEGOTIATE_ALL_JOBS_IN_CLUSTER setting
would need to be switched on, which isn't great, so the former way is
far less error-prone.

Matt