
Re: [HTCondor-users] per user "quotas" using match making



Hi Todd,

This config does the trick. Thanks a lot.

I agree, preemption would do a better job.
But it only makes sense for short-running jobs. Otherwise it is a waste of resources and time
(if the jobs are not able to checkpoint).
And that's our dilemma.
A typical job here (most of them run in the vanilla universe) lasts for several hours or even days,
so preemption is not really an option. Our users will not be amused if their jobs get killed
after a certain amount of runtime, or just when they were almost finished.
So this "default quota" per user is a disadvantage only
when the resources of the pool are not fully occupied. You are right about that.
But one can't have everything :)
Nevertheless, HTCondor is the best solution for non-dedicated clusters :)

Werner



On 02/11/2015 10:02 PM, Todd Tannenbaum wrote:
> On 2/10/2015 12:52 PM, Werner Hack wrote:
>> Hi all,
>>
>> I tried to limit the number of jobs running per user via a requirement definition
>> as described here: https://gist.github.com/dberzano/9995356
>>
>> Maybe this only works if static slots are used. With partitionable slots it does not work
>> this way.
>> Or did I miss some other configuration?
>>
>> The manual says:
>> SubmitterUserResourcesInUse: The integer number of slots currently utilized by the user submitting
>> the candidate job.
>>
>> How is SubmitterUserResourcesInUse handled for partitionable slots? The same way?
>> Or is there a better way to set a simple user quota for running jobs
>> so that one user cannot occupy the whole pool for a longer time?
>>
>> Any hint will be appreciated.
>> Best
>> Werner
>>
>
> Hi Werner,
>
> I looked at your config settings on GitHub.  My guess is that it fails to work with partitionable slots
> because CLAIM_PARTITIONABLE_LEFTOVERS is True by default.  This means the negotiator will
> match a partitionable slot (pslot) with a job, and give the pslot to the schedd.  The schedd will
> then run as many jobs as possible on that pslot until some pslot resource (like CPU cores) is
> exhausted.  The result is that your so-called normal users could run many jobs even if you only want
> them to be able to run one job (although they could only use one machine). I think it will work as you
> envisioned with partitionable slots if you add the following to your condor_config (on all your
> execute machines):
>
>  # Turn off claiming leftover resources by the schedd
>  # so that our quota via requirements magic works
>  CLAIM_PARTITIONABLE_LEFTOVERS = False
>
>  # Optionally turn on consumption_policy mechanism so more than
>  # one job can be matched per negotiation cycle to a
>  # machine considering that CLAIM_PARTITIONABLE_LEFTOVERS
>  # is disabled above.
>  CONSUMPTION_POLICY = True
>
> You can read about these knobs in the Manual, and/or you may find the following wisdom on the Wiki
> enlightening:
>   https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=ConsumptionPolicies
>
> Finally, the whole concept of placing a quota on the number of jobs a user can run seems to go against
> the idea of high-throughput computing.... Instead I'd suggest a policy that simply gives
> power users a better priority, and allows users with a better priority to preempt jobs submitted by
> users with a worse priority.  This way, if poor Todd is a normal user and nobody else even wants
> to use the pool, Todd isn't limited to just a few jobs for no good reason...  Of course, I
> understand that some jobs (especially non-idempotent jobs that have side effects, like creating
> records in a database) don't like to be preempted, but I'd argue that is pretty rare.
>
> Hope this helps,
> Todd
>
>
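
To illustrate the explanation quoted above, here is a toy sketch in Python of why a
per-user limit checked at match time can be bypassed while
CLAIM_PARTITIONABLE_LEFTOVERS is True. It is not HTCondor code: the function
jobs_started and its parameters are hypothetical, every job is assumed to need
exactly one core, and a single user and a single pslot are modelled. The quota
check stands in for a requirements expression that compares
SubmitterUserResourcesInUse against a limit.

```python
def jobs_started(idle_jobs, cores, quota, claim_leftovers):
    """Toy model: how many of one user's jobs end up running on one pslot.

    idle_jobs       -- number of 1-core jobs the user has queued (assumption)
    cores           -- CPU cores offered by the partitionable slot
    quota           -- per-user limit that is checked only at match time
    claim_leftovers -- stands in for CLAIM_PARTITIONABLE_LEFTOVERS
    """
    running = 0
    free = cores
    while idle_jobs > 0 and free > 0:
        # Match-time check: refuse a new match once the user is at the limit.
        if running >= quota:
            break
        # One match made by the negotiator: one job takes one core.
        idle_jobs -= 1
        free -= 1
        running += 1
        if claim_leftovers:
            # The schedd holds the claim on the pslot and keeps starting jobs
            # on the leftover cores; no match-time check happens here.
            extra = min(idle_jobs, free)
            idle_jobs -= extra
            free -= extra
            running += extra
    return running


if __name__ == "__main__":
    for leftovers in (True, False):
        started = jobs_started(idle_jobs=10, cores=8, quota=1,
                               claim_leftovers=leftovers)
        print(f"claim_leftovers={leftovers}: {started} job(s) running")
```

In this model a quota of 1 still lets the user fill all eight cores when leftovers
are claimed, and holds at one job when every job needs its own match.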
