
Re: [HTCondor-users] SYSTEM_PERIODIC_HOLD ignored



Hooray!!
It's working now, and jobs running over time are evicted.
Now on to my next project: holding jobs that, after 30 minutes of running, still use less than 10% of their requested memory (only for jobs requesting more than 8192 MiB):
WastingMemory = (JobStatus == 2 && (time() - JobCurrentStartExecutingDate) > 1800) && (RequestMemory > 8192) && (ResidentSetSize/1024 < RequestMemory/10)
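
To sanity-check the thresholds before deploying a clause like this, the logic can be mirrored in a few lines of Python. This is only a sketch, not HTCondor code: the function name and parameters are hypothetical, and it assumes ResidentSetSize is reported in KiB and RequestMemory in MiB, as the /1024 in the expression implies.

```python
def wasting_memory(job_status, now, start_executing, request_memory_mib, rss_kib):
    """Illustrative Python mirror of the WastingMemory expression above."""
    running = job_status == 2                     # JobStatus 2 means "running"
    ran_30_min = (now - start_executing) > 1800   # more than 30 minutes of execution
    big_request = request_memory_mib > 8192       # only jobs requesting more than 8192 MiB
    # Convert RSS from KiB to MiB and compare with a tenth of the request.
    # Integer division stands in for integer arithmetic on integer attributes.
    under_10_pct = (rss_kib // 1024) < (request_memory_mib // 10)
    return running and ran_30_min and big_request and under_10_pct
```

For example, a running job that requested 16384 MiB and has a resident set of 1,000,000 KiB (about 976 MiB) after 10,000 seconds would match, since 976 is below the 1638 MiB threshold.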

I believe this thread gives me all the tools needed to manage that one.

Many thanks,
David


On Thu, Aug 26, 2021 at 4:48 PM Stefano Dal Pra <stefano.dalpra@xxxxxxxxxxxx> wrote:
On 26/08/21 15:12, Stefano Dal Pra wrote:
> [SNIP]
>>
>> That works perfectly for MEMORY_EXCEEDED but totally ignored for
>> TIME_EXCEEDED.
[SNIP]

I stumbled on a job that had somehow survived and had been running for 21 days, so I wrote a
clause to get it held and verify that it works:

TooMuchTime = (jobstatus == 2 && (time() - JobStartDate > 86400 * 7))

This clause works, but it only takes effect after a condor restart;
condor_reconfig is not enough.
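
As a quick check of the arithmetic in the TooMuchTime clause above, here is a minimal Python sketch of the same comparison. The function name and parameters are hypothetical and only mirror the expression; it is not how HTCondor evaluates it.

```python
SECONDS_PER_DAY = 86400

def too_much_time(job_status, now, job_start_date, max_days=7):
    """Illustrative mirror of TooMuchTime: running and started more than max_days ago."""
    # JobStatus 2 means "running"; 86400 * 7 is 604800 seconds, i.e. one week.
    return job_status == 2 and (now - job_start_date) > SECONDS_PER_DAY * max_days
```

A job that started 21 days ago matches, while one at exactly 7 days does not, because the comparison is strict.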

Stefano