Re: [HTCondor-users] PERIODIC_HOLD is applied extremely infrequently
- Date: Mon, 11 May 2015 11:18:28 -0500
- From: Vladimir Brik <vladimir.brik@xxxxxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] PERIODIC_HOLD is applied extremely infrequently
I added D_FULLDEBUG, and "Evaluated periodic expressions" lines appear in
SchedLog as expected. For example:
Evaluated periodic expressions in 0.301s, scheduling next run in 60s
My periodic hold expression is defined like this:
rss_max = 6000
mem_hold = ((isUndefined(ResidentSetSize_RAW) =?= False && \
  isUndefined(RequestMemory) =?= False && \
  ResidentSetSize_RAW/1000 > RequestMemory && \
  ResidentSetSize_RAW/1000 > 6000) =?= True)
SYSTEM_PERIODIC_HOLD = ((JobStatus == 2 && JobUniverse == 5 && \
  $(mem_hold) && isUndefined(RemoteHost) =?= False && \
  regex("gzk9000c", RemoteHost) =!= True) =?= True)
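To make it easier to see which attribute values should trigger the hold, here is a rough Python model of the expression above. It is only a sketch, not how the schedd evaluates ClassAds: the job ad is a plain dict, the function names mem_hold and should_hold are made up for illustration, an undefined attribute is modelled as a missing key, the division is assumed to be integer division, and the regex(...) clause is assumed to behave like a substring-style search.

```python
import re


def mem_hold(ad, rss_limit=6000):
    """Approximate the mem_hold macro for a job ad given as a dict."""
    rss = ad.get("ResidentSetSize_RAW")
    req = ad.get("RequestMemory")
    # Both attributes must be defined, as in the isUndefined(...) =?= False checks.
    if rss is None or req is None:
        return False
    # Assumption: dividing two integer attributes truncates, so use //.
    rss_scaled = rss // 1000
    return rss_scaled > req and rss_scaled > rss_limit


def should_hold(ad):
    """Approximate SYSTEM_PERIODIC_HOLD for a job ad given as a dict."""
    host = ad.get("RemoteHost")
    return (
        ad.get("JobStatus") == 2        # 2 = running
        and ad.get("JobUniverse") == 5  # 5 = vanilla universe
        and mem_hold(ad)
        and host is not None
        # Assumption: the regex clause exempts hosts whose name contains the pattern.
        and re.search("gzk9000c", host) is None
    )
```

With this model, a running vanilla job with ResidentSetSize_RAW = 7000000 and RequestMemory = 4000 on a host that does not match "gzk9000c" would be held, while the same job with ResidentSetSize_RAW = 5000000 would not, because it fails the 6000 threshold.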
For testing, I tried using this:
SYSTEM_PERIODIC_HOLD = (JobStatus == 2 && JobUniverse == 5 && Owner == "vbrik")
The interesting thing about the expression above is that it puts *some*
jobs on hold immediately after they start running (as expected), but
jobs that weren't put on hold immediately after starting are never put
on hold.
While debugging, I am also using this:
PERIODIC_EXPR_INTERVAL = 60
MAX_PERIODIC_EXPR_INTERVAL = 300
PERIODIC_EXPR_TIMESLICE = .9
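As a sanity check on these three settings, the following Python sketch models one plausible way they could combine. It is a simplified assumption, not the schedd's actual timer code: the next delay is taken to be the last evaluation time divided by the timeslice, clamped between the minimum and maximum intervals. The function name next_periodic_delay is hypothetical.

```python
def next_periodic_delay(last_runtime_s, min_interval_s=60.0,
                        max_interval_s=300.0, timeslice=0.9):
    """Guess the delay before the next periodic-expression evaluation.

    Assumed model: wait long enough that evaluation uses at most
    `timeslice` of the schedd's time, but never less than the minimum
    interval and never more than the maximum interval.
    """
    wanted = last_runtime_s / timeslice
    return min(max(wanted, min_interval_s), max_interval_s)
```

Under this model, an evaluation that took 0.301s gives 0.301 / 0.9, roughly 0.33s, which is clamped up to 60s. That is consistent with the "scheduling next run in 60s" line quoted above.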
Vlad
On 05/08/15 15:51, Ben Cotton wrote:
Vlad,
You should see lines like:
05/08/15 16:45:51 (pid:2968) Evaluated periodic expressions in 0.000s, scheduling next run in 300s
in your sched log (assuming SCHEDD_DEBUG includes D_FULLDEBUG). If you
see that at the expected interval (based on your
PERIODIC_EXPR_INTERVAL setting) then it's probably a problem in your
SYSTEM_PERIODIC_HOLD expression. Could you share that? If it doesn't
show up at the expected time, we'll have to try something else.
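One way to check whether those lines show up at the expected interval is to parse them out of the log and compare the gaps between consecutive runs. The Python sketch below assumes each entry is a single line in the format of the example above, with an MM/DD/YY HH:MM:SS timestamp. The helper names periodic_runs and late_runs and the 5-second slack are illustrative choices, not part of HTCondor.

```python
import re
from datetime import datetime

# Matches lines such as:
# 05/08/15 16:45:51 (pid:2968) Evaluated periodic expressions in 0.000s, scheduling next run in 300s
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) .*?"
    r"Evaluated periodic expressions in ([\d.]+)s, "
    r"scheduling next run in (\d+)s"
)


def periodic_runs(lines):
    """Return (timestamp, runtime_s, planned_delay_s) for each matching line."""
    runs = []
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            when = datetime.strptime(m.group(1), "%m/%d/%y %H:%M:%S")
            runs.append((when, float(m.group(2)), int(m.group(3))))
    return runs


def late_runs(runs, slack_s=5):
    """Return (timestamp, planned_delay_s, actual_gap_s) where the next run came late."""
    late = []
    for (t0, _, planned), (t1, _, _) in zip(runs, runs[1:]):
        gap = (t1 - t0).total_seconds()
        if gap > planned + slack_s:
            late.append((t0, planned, gap))
    return late
```

For example, calling late_runs(periodic_runs(open(path))) with path pointing at the SchedLog file would list any evaluation that was followed by a longer gap than the log line announced.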
Thanks,
BC