
Re: [condor-users] Questions about resource allocation and user priorities...





Jonathan Giddy wrote:

Michael S. Root wrote:



The thing is, the individual jobs that make up each dag usually only take
between 1 and 20 minutes each, but there can be several hundred jobs per
dag. Why don't the second user's jobs acquire resources as the first
user's dag's individual "sub-jobs" finish, which they do fairly often?


Currently, there is no way to tell Condor that you want preemption of
resource claims to be considered at job boundaries. However, you
_can_ set PREEMPTION_REQUIREMENTS to allow preemption during the first
few minutes of a job's life, instead of using the default expression,
which allows preemption after the claim has existed for one hour:



When a job completes, the machine returns to the Claimed/Idle state, where the START expression is evaluated before another job for the same user is started.

So I add the following to my START expression:

START = $(START) && ( State != "Claimed" || $(StateTimer) < 30 * $(MINUTE) )

If the machine has been in the Claimed state for more than half an hour (a
sign that at least one job has already run for the user), the expression
evaluates to False and the machine returns to the Unclaimed state, where it
can be matched to any user.
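
The logic of that START clause can be sketched in Python to check the cases by hand. This is only an illustrative model, not Condor code: the function name start_allows_job and its parameters are hypothetical, base_start stands in for whatever the existing $(START) evaluates to, and seconds_in_state stands in for $(StateTimer).

```python
MINUTE = 60  # seconds, mirroring the $(MINUTE) macro used in the expression


def start_allows_job(state, seconds_in_state, base_start=True,
                     limit_seconds=30 * MINUTE):
    """Model of: $(START) && (State != "Claimed" || $(StateTimer) < 30 * $(MINUTE)).

    state            -- machine state name, e.g. "Claimed" or "Unclaimed"
    seconds_in_state -- how long the machine has been in that state
    base_start       -- assumed result of the pre-existing START expression
    """
    return base_start and (state != "Claimed" or seconds_in_state < limit_seconds)


# A claim that is 10 minutes old may start another job for the same user.
print(start_allows_job("Claimed", 10 * MINUTE))
# A claim older than 30 minutes is refused at the next job boundary.
print(start_allows_job("Claimed", 31 * MINUTE))
# An unclaimed machine is not affected by the extra clause.
print(start_allows_job("Unclaimed", 120 * MINUTE))
```

Note that the check only happens when a job finishes, so a job that starts at minute 29 of the claim still runs to completion.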


Very good point!


The effect of this is to relinquish the user's claim at the first job boundary after the claim has lasted 30 minutes, whether or not anybody else is in line to use the machine. For a user with a lot of jobs in the queue, you will see a little more latency, since the claim must be re-established each time. This delay will be on the order of one negotiation cycle, about 5 minutes by default. On the positive side, you get potentially faster reallocation of resources without any additional jobs being killed.
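
To get a rough feel for that latency cost, here is a back-of-the-envelope sketch in Python. It is a simplified model under stated assumptions, not a measurement: every job is assumed to take the same time, the machine is assumed to sit idle for one full 5-minute negotiation cycle after each released claim, and the function name claim_cycle_overhead is hypothetical.

```python
def claim_cycle_overhead(job_minutes, claim_limit_minutes=30.0,
                         renegotiation_minutes=5.0):
    """Return (busy_minutes, idle_fraction) for one claim cycle.

    Assumptions (not from Condor itself): all jobs have equal length, and
    re-establishing the claim costs one full negotiation cycle of idle time.
    """
    if job_minutes <= 0:
        raise ValueError("job_minutes must be positive")
    busy = 0.0
    # A new job is started only while the claim is younger than the limit;
    # a job that has already started always runs to completion.
    while busy < claim_limit_minutes:
        busy += job_minutes
    idle_fraction = renegotiation_minutes / (busy + renegotiation_minutes)
    return busy, idle_fraction


for minutes in (1.0, 10.0, 20.0):
    busy, idle = claim_cycle_overhead(minutes)
    print(f"{minutes:>4} min jobs: {busy} busy min per claim, {idle:.1%} idle")
```

With 10-minute jobs this model gives 30 busy minutes per claim and about 14% idle time; with 20-minute jobs the second job runs past the limit, giving 40 busy minutes and about 11% idle time. The real delay depends on where in the negotiation cycle the claim is released.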

Dan Bradley

