Re: [Condor-users] MaxJobRetirementTime



> MaxJobRetirementTime is described in the manual as the 'maximum time in
> seconds to let this job run uninterrupted before kicking it off when it
> is being preempted.' If MaxJobRetirementTime is set to some large value,
> does this mean that a job which has a higher priority than the one
> currently running on the node has to wait until the MaxJobRetirementTime
> is reached or will the negotiator choose another machine that has a
> lower MaxJobRetirementTime hence allowing a waiting job to be scheduled
> on a resource that is going to allow it to be allocated faster?

A job with higher priority will wait for MaxJobRetirementTime to expire
before forcefully kicking the lower priority job off the machine. The
REQUEST_CLAIM_TIMEOUT value determines how long the higher priority job
is willing to wait before it tries to re-negotiate for another machine.
If this value is high, it will wait a long time; if it's low, it will
try for a bit, then give up its preemption request and go back to the
negotiator to try again for another machine.
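
As a rough sketch of how the two settings interact (the numbers are
made up for illustration, and I'm assuming the usual split where
MAXJOBRETIREMENTTIME lives in the execute machine's config and
REQUEST_CLAIM_TIMEOUT in the submit machine's config):

```
# Execute machine (condor_startd): a preempted job may keep running
# for up to 1 hour before it is kicked off. Illustrative value.
MAXJOBRETIREMENTTIME = 3600

# Submit machine (condor_schedd): wait at most 10 minutes for the
# claim to be granted, then give up and go back to the negotiator.
# Illustrative value.
REQUEST_CLAIM_TIMEOUT = 600
```

With values like these, the higher priority job would stop waiting on
a machine whose running job still has more than 10 minutes of
retirement time left, and would try for a different match instead.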

When negotiating, the negotiator pays no heed to MaxJobRetirementTime
vs. REQUEST_CLAIM_TIMEOUT unless you tell it that this is important to
your job. You can do this by setting the NEGOTIATOR_PRE_JOB_RANK and
NEGOTIATOR_POST_JOB_RANK expressions to include references to the
MaxJobRetirementTime value on the remote host, so that machines are
sorted from shortest retirement time to longest before jobs are matched
to them.
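
Something along these lines in the negotiator's config might do it.
Treat it as an untested sketch: it assumes the machine ad advertises
MaxJobRetirementTime and that it evaluates to a plain number of
seconds, which may not hold if your startd defines it as an expression
over job attributes.

```
# Higher rank wins, so subtract the retirement time to make machines
# with a shorter MaxJobRetirementTime look better. Machines that do
# not advertise the attribute are treated as 0 (no retirement time).
NEGOTIATOR_PRE_JOB_RANK = 0 - ifThenElse(isUndefined(MaxJobRetirementTime), 0, MaxJobRetirementTime)
```

The pre-job rank is evaluated ahead of the job's own Rank expression,
so this would override user preferences. Putting the same expression
in NEGOTIATOR_POST_JOB_RANK instead would only use it to break ties
among machines the job ranks equally.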
 
- Ian