
Re: [HTCondor-users] Condor held jobs should retry/release after certain configured timeout automatically



On Wed, Apr 8, 2015 at 7:26 AM, Sridhar Thumma <deadman.den@xxxxxxxxx> wrote:

> SYSTEM_PERIODIC_RELEASE=((NumSystemHolds < 5 && (time() -
> EnteredCurrentStatus) > 30) &&
> (HoldReason.substr("InvalidAMIID.NotFound",0)!=""))
>
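That's not how substr is called. In ClassAd expressions it's a plain
function, substr(string, offset [, length]), not a method on the
attribute. I'm not sure substr would be all that helpful here anyway.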

>  SYSTEM_PERIODIC_RELEASE=((NumSystemHolds < 5 && (time() -
> EnteredCurrentStatus) > 30) && regexp("^.+InvalidAMIID.+$",HoldReason))
>
It looks like the regexp parsing doesn't like the use of ^ and $. You
might try dropping them and matching on the substring alone, i.e.
regexp("InvalidAMIID", HoldReason). I did a similar test for sleep
jobs in my history (version 8.3.2):

-bash-3.2$ condor_history -const 'regexp("^.+sleep.+$", Cmd)' | wc -l
1
-bash-3.2$ condor_history -const 'regexp("sleep", Cmd)' | wc -l
5146
-bash-3.2$
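
One possible reading of those two counts (an assumption on my part, not
something verified against HTCondor's regex engine): "^.+sleep.+$" needs
at least one character on both sides of "sleep", so a Cmd that ends in
"sleep" can never match it. The sketch below shows the same effect with
grep -E standing in for regexp(); the three Cmd values are made up for
illustration.

```shell
# Hypothetical Cmd values; real jobs will have their own paths.
cmds='/bin/sleep
/usr/bin/sleep
/home/user/sleep_wrapper.sh'

# Anchored pattern: needs at least one character before AND after "sleep".
anchored=$(printf '%s\n' "$cmds" | grep -cE '^.+sleep.+$')

# Unanchored pattern: matches "sleep" anywhere in the string.
unanchored=$(printf '%s\n' "$cmds" | grep -cE 'sleep')

echo "anchored=$anchored unanchored=$unanchored"
```

Only the wrapper script has text after "sleep", so the anchored pattern
counts one line while the unanchored one counts all three.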

Since you have a held job in the queue, you can use condor_q with a
constraint to test your SYSTEM_PERIODIC_RELEASE expression before you
set it.
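
A minimal sketch of that check, assuming the held job's HoldReason
contains the string "InvalidAMIID". JobStatus == 5 selects held jobs,
and the attributes printed with -af are just my choice for inspection.
Only the regexp part of the release expression is tested here; the
NumSystemHolds and time() clauses would be added once this matches.

```shell
# The regexp clause from the proposed release expression, limited to
# held jobs (JobStatus == 5).
constraint='JobStatus == 5 && regexp("InvalidAMIID", HoldReason)'

if command -v condor_q >/dev/null 2>&1; then
    # Print the job ID and hold reason for every job the constraint matches.
    condor_q -constraint "$constraint" -af ClusterId ProcId HoldReason
else
    # Not on a machine with the HTCondor tools installed.
    echo "condor_q not found; run this on the submit host"
fi
```

If the held job shows up in the output, the same regexp clause should
also evaluate to true inside SYSTEM_PERIODIC_RELEASE.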


Thanks,
BC

-- 
Ben Cotton
main: 888.292.5320

Cycle Computing
Better Answers. Faster.

http://www.cyclecomputing.com
twitter: @cyclecomputing