
Re: [Condor-users] How to have schedd drop claim after each job



Hi Maarten,

I think the feature you want is in the current development branch and will be released in Condor 6.7.2. In the machine policy, you will be able to specify 'MaxJobRetirementTime', an expression that sets the maximum runtime for a job in a 'retiring' claim. A claim may go into retirement because of any type of preemption, or because Condor is being gracefully shut down or restarted. It stays in the retiring state until the current job finishes, the maximum retirement time expires, or (in the shutdown case) SHUTDOWN_GRACEFUL_TIMEOUT expires.
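As a rough sketch of what such a machine policy might look like once the feature is released: the fragment below assumes the setting is written as an ordinary configuration expression evaluating to a number of seconds, and the 24-hour limit is an arbitrary illustrative value, not a recommendation.

```
# Sketch only: assumes MaxJobRetirementTime is given in seconds.
MINUTE = 60
HOUR   = (60 * $(MINUTE))

# Let a job in a retiring claim keep running for up to 24 hours
# (illustrative value) before the claim is forcibly vacated.
MaxJobRetirementTime = (24 * $(HOUR))
```

With a setting like this, a preempted claim would be released when the current job finishes or the retirement time runs out, whichever comes first.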

If you are on the stable series, there are some less-than-ideal methods people have come up with to handle resource reallocation with minimal job death. One is to set PREEMPTION_REQUIREMENTS to allow preemption only during the first 10 minutes of a job. If you _really_ can't live with jobs being killed, also add a USER_JOB_WRAPPER script that sleeps for 10 minutes before starting each job, so that any preemption lands on the sleep rather than on real work.
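A minimal sketch of that workaround follows. Several things here are assumptions rather than tested settings: the time since the machine entered its current activity is used as an approximation of how long the job has been running, the priority clause is borrowed from the style of the stock example configuration, and the wrapper path is hypothetical.

```
# Sketch only; adjust names and values for your pool.
MINUTE = 60

# Approximate job runtime as time spent in the current activity (assumption).
ActivityTimer = (CurrentTime - EnteredCurrentActivity)

# Negotiator side: allow priority preemption only during the first
# 10 minutes of a job, and only for a clearly better-priority user.
PREEMPTION_REQUIREMENTS = ($(ActivityTimer) < (10 * $(MINUTE))) && \
                          (RemoteUserPrio > SubmittorPrio * 1.2)

# Execute-machine side: hypothetical wrapper that sleeps for 10 minutes
# and then execs the real job with the arguments it was given.
USER_JOB_WRAPPER = /path/to/delay_wrapper.sh
```

The cost of this approach is that every job pays the 10-minute delay, whether or not anyone is waiting for the machine.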

To my knowledge, there is no way to force the schedd to drop each claim after running a job. Anybody with a clever solution, please correct me!

--Dan

Maarten Ballintijn wrote:

Hello,

Most of our jobs are vanilla universe for the moment. In order not
to waste CPU time I'd like them to run to completion. I understand
how to configure PREEMPT and PREEMPTION_REQUIREMENTS etc. to avoid
killing the jobs. The catch is that schedd hangs on to the claim
even if the priority dictates another job should run.

Is there a way to have schedd relinquish the claim "between" two jobs,
either always or when appropriate?

Thanks for your help,

Maarten.