[Condor-users] on_exit_remove and multiple runs into a single job cluster


I am trying to submit multiple runs from a single submit description file, and I want the jobs that do not succeed (RC=129) to return to the queue and wait for another opportunity to run. I am using the on_exit_remove command as follows:

universe                = vanilla
on_exit_remove          = (ExitBySignal == TRUE) || (ExitCode != 129)
run_as_owner            = false
Executable              = teste-bind.bat
Arguments               = ENG_TESTE 5
InitialDir              = caso-$(process)
when_to_transfer_output = ON_EXIT
output                  = teste-bind-$(process).out
error                   = teste-bind-$(process).err
Log                     = ..\teste-bind-job.log
queue 20

Under some conditions the executable sets its exit code to 129, meaning that the job should be placed back into the Idle state and tried again in the next negotiation cycle.
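To make the intent concrete, here is a minimal Python sketch of how I expect the expression to behave for each individual job. It is only a plain-Python model of the boolean logic, not Condor's own ClassAd evaluation, and the names should_leave_queue and REQUEUE_EXIT_CODE are mine, chosen for illustration.

```python
from typing import Optional

# Exit code that should send the job back to the Idle state (from the submit file).
REQUEUE_EXIT_CODE = 129


def should_leave_queue(exit_by_signal: bool, exit_code: Optional[int]) -> bool:
    """Model of: (ExitBySignal == TRUE) || (ExitCode != 129).

    Returns True when the job is expected to leave the queue and
    False when it is expected to go back to Idle.
    exit_code is None when the job was killed by a signal.
    """
    if exit_by_signal:
        # A job killed by a signal leaves the queue whatever its exit code.
        return True
    return exit_code != REQUEUE_EXIT_CODE


if __name__ == "__main__":
    cases = [
        ("normal success", False, 0),
        ("retry requested", False, 129),
        ("killed by signal", True, None),
    ]
    for label, by_signal, code in cases:
        outcome = "leave queue" if should_leave_queue(by_signal, code) else "back to Idle"
        print(f"{label}: {outcome}")
```

Under this reading, only a job that exits normally with code 129 stays in the queue, and that decision is made separately for each process in the cluster.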

My problem is that all jobs that ended with RC=129 were terminated instead of being placed back into the Idle state.

I have run this procedure with queue 1 and it behaved as expected: the job ran to completion when the error condition was not met, and was placed back into the Idle state when the error condition was raised.

It seems that the on_exit_remove expression is not applied to multiple runs the way I have stated it in my submit description file. Is there any way to have it evaluated for each individual run?

Attached is the log file of the runs.


