
[Condor-users] on_exit_remove in submit file



Hi
 
We are running a Windows pool in which a small, varying percentage of execute
nodes occasionally return the old -10737**** type errors. By using:
 
== 0)

in the submit file these are detected, requeued, and run OK somewhere else.
 
Just recently a user has also had issues with jobs returning immediately
and with the output file (specified in the output = statement) being 0 bytes in size
(it should contain a fair bit of data). If manually resubmitted, it will run OK.
This user has asked if there is a way to do something like:
 
== 0)  && (OutputFileSize != 0)
 
where OutputFileSize would be the size of the file specified in output =
 
While looking into why this is happening for their jobs, I thought it would be
useful to be able to do something like the line above. I'm guessing it would
need to be kludged in some way because, as far as I know, there is no such
ClassAd attribute, or maybe it's not possible at all.
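
One possible kludge, as a rough sketch rather than a tested solution: wrap the real program in a small script that turns "exited 0 but wrote nothing to stdout" into a non-zero exit code. This assumes Python is available on the execute nodes, that the file named in output = is simply the job's stdout, and that the existing on_exit_remove expression tests the job's exit code. The script name, the function name exit_code_for and the value 99 are all made up for illustration.

```python
import subprocess
import sys

# Hypothetical exit code used to flag "ran OK but produced no output".
EMPTY_OUTPUT_EXIT_CODE = 99


def exit_code_for(returncode, stdout_bytes):
    """Decide the wrapper's exit code from the real job's result."""
    if returncode != 0:
        # The real program already failed: pass its code through unchanged.
        return returncode
    if len(stdout_bytes) == 0:
        # Clean exit but empty stdout: report it as a failure.
        return EMPTY_OUTPUT_EXIT_CODE
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Run the real program (given as the wrapper's arguments) and capture
    # its stdout in memory; this is only sensible for modest output sizes.
    proc = subprocess.run(sys.argv[1:], stdout=subprocess.PIPE)
    # Forward the captured bytes so the output file still gets its data.
    sys.stdout.buffer.write(proc.stdout)
    sys.stdout.buffer.flush()
    sys.exit(exit_code_for(proc.returncode, proc.stdout))
```

The submit file would then name the wrapper as the executable and pass the real program and its arguments through arguments =, so an empty output file shows up as a non-zero exit code and needs no new attribute. How very large or negative exit codes are passed back to the caller depends on the platform, so that part would need checking on the pool itself.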
 
I'd appreciate people's comments or suggestions on this.
 
Thanks
 
Cheers
 
Greg