
Re: [Condor-users] on_exit_remove and multiple runs into a single job cluster



According to your log, your jobs are exiting with code 0, not 129, so I'd expect them to be marked as Completed and removed instead of being put back into the Idle state.
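
To make that reasoning concrete, here is a rough Python mirror of the on_exit_remove expression from the submit file. It is only a sketch of the boolean logic, not Condor's ClassAd evaluator, and the names should_remove and RETRY_CODE are made up for the example.

```python
RETRY_CODE = 129  # exit code the submit file treats as "try again"


def should_remove(exit_by_signal: bool, exit_code: int) -> bool:
    # Mirrors: on_exit_remove = (ExitBySignal == TRUE) || (ExitCode != 129)
    return exit_by_signal or exit_code != RETRY_CODE


# Compare the exit code seen in the log (0) with the intended retry code (129).
for code in (0, 129):
    outcome = "leaves the queue" if should_remove(False, code) else "stays in the queue"
    print(f"exit code {code}: job {outcome}")
```

With an exit code of 0 the expression is true, so the job leaves the queue; only a normal exit with code 129 makes it false.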

You should run a program that you know exits with code 129 to test this. I just did and it worked fine, though I'm not running Windows. If the program is exiting with 129 and Condor is not recognizing that on Windows then that would be a bug.
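
One possible way to get such a program, sketched in Python on the assumption that an interpreter is available where the job runs: a child process that always exits with 129, plus a local check that the code really comes back as 129 before handing anything to Condor. The helper name run_and_get_exit_code is hypothetical.

```python
import subprocess
import sys


def run_and_get_exit_code(code: int) -> int:
    # Start a child Python process that exits immediately with `code`
    # and return the exit status the parent process observes.
    child = [sys.executable, "-c", f"import sys; sys.exit({code})"]
    return subprocess.run(child).returncode


if __name__ == "__main__":
    # Local sanity check before submitting a test job.
    print("child exit code:", run_and_get_exit_code(129))
```

If the local check reports 129 but the job log still shows a different return value, the difference is arising somewhere between the executable and what Condor records.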

Best,


matt

kschwarz@xxxxxxxxxxxxxx wrote:
Hi,

I am trying to submit multiple runs from a single submit description file, and I want the jobs that do not succeed (RC=129) to return to the queue and wait for another opportunity to run. I am using the on_exit_remove command as follows:

#
universe                = vanilla
on_exit_remove          = (ExitBySignal == TRUE) || (ExitCode != 129)
run_as_owner            = false
Executable              = teste-bind.bat
Arguments               = ENG_TESTE 5
InitialDir              = caso-$(process)
when_to_transfer_output = ON_EXIT
output                  = teste-bind-$(process).out
error                   = teste-bind-$(process).err
Log                     = ..\teste-bind-job.log
queue 20

Under some conditions the exit code is set to 129, meaning that the job should be placed back into the Idle state and retried in the next negotiation cycle.

My problem is that all the jobs that exited with RC=129 were terminated instead of being placed back into the Idle state.

I have run this procedure with queue 1 and it behaved as expected: the job ran to the end when the error condition was not met, and it was placed back into the Idle state when the error condition was raised.

It seems that the on_exit_remove expression is not applied to multiple runs as specified in my submit description file. Is there any way to have it apply to each individual run?

Attached is the log file of the runs.

Klaus