
Re: [Condor-users] REQUEST_CLAIM_TIMEOUT and MAXJOBRETIREMENTTIME



Hi Myung,

Yes, REQUEST_CLAIM_TIMEOUT is a configuration variable that you can set.

If you never want jobs to be matched to a busy machine, and you do not use the startd RANK expression, you can achieve that by setting PREEMPTION_REQUIREMENTS=False in the negotiator configuration. If you do that, I would also recommend setting the startd configuration variable CLAIM_WORKLIFE to something reasonable (e.g. an hour).
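
As a rough sketch, the settings discussed here might look like the following in the configuration files. The specific values are assumptions for illustration only: 3600 seconds stands in for "an hour", and 60 seconds is the one-minute timeout you proposed. Each line belongs in the configuration read by the daemon named in its comment.

```
# Negotiator configuration: never match jobs to a machine that is
# already busy through user-priority preemption.
PREEMPTION_REQUIREMENTS = False

# Startd configuration: stop accepting new jobs on a claim after this
# many seconds, so that claims are eventually released and renegotiated.
# (3600 is an illustrative value for "an hour".)
CLAIM_WORKLIFE = 3600

# Schedd configuration: give up on a claim request after this many
# seconds. (60 is the illustrative one-minute value from the question;
# the log entry shows 1800 currently in effect.)
REQUEST_CLAIM_TIMEOUT = 60
```

After changing these, the affected daemons need to reread their configuration (for example with condor_reconfig) before the new values take effect.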

--Dan

On 7/26/12 10:14 AM, Myung Cho wrote:
Hi,

Is the REQUEST_CLAIM_TIMEOUT value something we can set directly? I don't
see it as a variable in the default condor_config file. I would like
to set it to something shorter, perhaps 1 minute. The current default
seems to be 30 minutes. Here is an example of a log entry:

SchedLog:07/26/12 06:09:11 (pid:12862) Timed out requesting claim
slot4@xxxxxxxxxxx <10.10.26.104:47148> for XXXXX after
REQUEST_CLAIM_TIMEOUT=1800 seconds.

We seem to be running into multiple occurrences of this daily and
suspect it is due to preemption and the MAXJOBRETIREMENTTIME value
being set to 3 hours. In the example above, a scheduler gets matched
up with slot4@xxxxxxxxxxx but can't seem to claim it, since there is a
job running on c4.XXXX.com that is supposed to be preempted but is
still running due to the grace period specified by
MAXJOBRETIREMENTTIME. BTW, I'm not sure whether this is relevant, but
this is happening across two pools, with the scheduler on pool A and
the matched slot on pool B.

My current thought is to reduce REQUEST_CLAIM_TIMEOUT to something
short, like a minute, so that if the slot is not freed up, the scheduler
will just move on to the next free node. The current behavior is for
the job to stay tied up even though other slots free up shortly after.
Or is there a better way to handle this issue?


Thanks,

Myung