Re: [HTCondor-users] How to always avoid evictions of jobs belonging to GROUP_NAMES
- Date: Fri, 15 Nov 2019 15:05:16 +0000
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] How to always avoid evictions of jobs belonging to GROUP_NAMES
On 11/15/2019 2:04 AM, Giuseppe Di Biase wrote:
> Hi All,
I think we will need more clarification on the policy you desire. We
could help better if you could explain the scheduling policy you have in
mind as simply as you can, ignoring HTCondor configuration issues for the
moment. Once we understand what you want to happen, as a second step we
can suggest what to add to your HTCondor config.
> i would like to avoid evictions of jobs belonging to a *user* in a
> GROUP_NAMES defined.
Under what conditions are you seeing jobs being evicted now?
Under what conditions do you *want* HTCondor to evict jobs? As food for
thought, some example answers are:
2. If a job runs more than X amount of time (e.g. kill a "run-away" job)
3. If a job runs more than X amount of time and some other job from a
higher priority group or user is waiting to run
4. If a job runs more than X amount of time, and HTCondor is trying
to drain this node
5. If a job uses more memory (RAM) than requested
6. If a machine prefers to run specific types of jobs (e.g. GPU
jobs), and such a preferred job is waiting to run
Most people like to start out simple with (1), then add (2) and (5).
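As a rough illustration of (2) and (5), here is a sketch of what such a policy might look like in the configuration of the condor_schedd. The 24-hour limit is an arbitrary value chosen for the example, and holding (rather than removing) the over-memory job is just one possible choice; please check the expressions against the manual for your HTCondor version before relying on them.

```
# Sketch only: goes in the condor_schedd configuration.
# JobStatus == 2 means the job is currently running.

# (2) Remove a "run-away" job that has been running for more than
#     24 hours (the 24-hour value is an assumption for this example).
SYSTEM_PERIODIC_REMOVE = (JobStatus == 2) && \
    ((time() - EnteredCurrentStatus) > (24 * 3600))

# (5) Put a job on hold if it uses more memory than it requested
#     (both attributes are in megabytes).
SYSTEM_PERIODIC_HOLD = (JobStatus == 2) && (MemoryUsage > RequestMemory)
```

Note that neither expression looks at who else is waiting to run, so neither one involves preemption between users or groups.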
> I defined this GROUP_NAMES in the node running Negotiator/Collector
> daemon but how and where (daemons) i must define his priority respect
> jobs belonging to others GROUP_NAMES?
So you defined some accounting groups (with GROUP_NAMES), and now you
wish to tell HTCondor how many resources each group should get?
Let's say you have three groups X, Y, and Z. Do you want a strict
priority across groups so that, for instance, X always gets machines ahead
of Y, and Y always gets machines ahead of Z? Alternatively, perhaps you
want to say that X should get 50% of the CPUs in the pool, Y should get 40%, and
Z should get 10%. Either policy is possible with HTCondor. Different
users within the same group will get a proportional share (i.e.
'fair-share' across users in the same group). For explanation and
Once you have groups the way you want, you can decide if you want
preemption or not. Most people do NOT want preemption. For instance,
imagine you have two groups, A and B, and you want each to get 50% of
your pool. Imagine users in group A are using the entire pool, and then
suddenly a group B job is submitted. Without preemption, the group B
job will wait until some group A job exits and then it will start. With
preemption, HTCondor will kill a group A job and give the slot it was
running on over to group B. Most organizations prefer to avoid
preemption, because all the cycles consumed by the killed job might be
wasted (if the job cannot checkpoint).
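If you decide you do not want preemption, a minimal sketch of the relevant settings might look like the following. This assumes you do not rely on preemption anywhere else in your pool; treat it as a starting point rather than a complete policy.

```
# Sketch only: disable preemption.

# In the condor_negotiator configuration (central manager):
# never preempt a running job because of user or group priority.
PREEMPTION_REQUIREMENTS = False

# In the condor_startd configuration (execute nodes):
# the machine has no preference between jobs, so a running job is
# never preempted by a job the machine would rank higher.
RANK = 0
# Never evict a running job based on machine state.
PREEMPT = False
```

With settings along these lines, the group B job in the example above simply waits until a group A job exits.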
> Can i use a formula like this in SCHEDD?
> IsX = (Experiment =?= "X")
> IsY = (Experiment =?= "Y")
> IsZ = (Experiment =?= "Z")
> RANK = $(X)*70 + $(Y)*10 + $(Z)*8
No, the RANK expression in your condor_config file is only used by the
STARTD, not the schedd, and it always implies preemption (which I am
guessing you do not want). If you wish to define priorities / quotas
across different groups, you will want to use GROUP_QUOTA_xxxx settings
in the configuration of your condor_negotiator. See the above reference
to the HTCondor Manual for more info...
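To give a rough idea, here is a sketch of a negotiator configuration for the 50% / 40% / 10% example with three groups. The group names group_x, group_y and group_z are made up for illustration, and whether you want GROUP_ACCEPT_SURPLUS depends on your policy.

```
# Sketch only: goes in the condor_negotiator configuration
# (central manager). Group names are hypothetical.
GROUP_NAMES = group_x, group_y, group_z

# Dynamic quotas are fractions of the pool.
GROUP_QUOTA_DYNAMIC_group_x = 0.50
GROUP_QUOTA_DYNAMIC_group_y = 0.40
GROUP_QUOTA_DYNAMIC_group_z = 0.10

# Optional: let a group use slots beyond its quota when other
# groups are not using theirs.
GROUP_ACCEPT_SURPLUS = True
```

Jobs are then placed in a group from the submit side, for example with accounting_group = group_x in the submit file.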
Hope the above helps,