Re: [HTCondor-users] How to always avoid evictions of jobs belonging to GROUP_NAMES
- Date: Sat, 16 Nov 2019 12:03:30 +0530
- From: Vikrant Aggarwal <ervikrant06@xxxxxxxxx>
- Subject: Re: [HTCondor-users] How to always avoid evictions of jobs belonging to GROUP_NAMES
If you are using startd-based eviction, the following configuration will probably work for you,
provided you are setting Groupname with the JOB_TRANSFORM feature of the schedd:
STARTD_JOB_ATTRS = $(STARTD_JOB_ATTRS) Groupname
RetirementTime = 60 * $(MINUTE)
LOCAL_Groupname = (MY.Groupname =?= "A" || MY.Groupname =?= "B")
TARGET_Groupname = (TARGET.Groupname =?= "A" || TARGET.Groupname =?= "B")
RANK = ifthenelse(!isundefined(TARGET.Groupname), $(TARGET_Groupname), $(LOCAL_Groupname))
You also need to set ALLOW_PSLOT_PREEMPTION to TRUE on the negotiator.
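As a rough sketch of the two pieces mentioned above, the fragment below shows one way the Groupname attribute might be populated by a schedd job transform, together with the negotiator setting. The transform name SetGroupname is hypothetical, and copying from the job's AcctGroup attribute is an assumption about where the group name comes from; adjust both to your setup.

```conf
# --- schedd configuration ---
# "SetGroupname" is a hypothetical transform name chosen for illustration.
JOB_TRANSFORM_NAMES = $(JOB_TRANSFORM_NAMES) SetGroupname
# Assumption: the group name lives in the job's AcctGroup attribute.
# Copy it into the custom Groupname attribute used by the startd policy.
JOB_TRANSFORM_SetGroupname = [ copy_AcctGroup = "Groupname"; ]

# --- negotiator configuration ---
# Let the negotiator preempt dynamic slots carved from a partitionable slot.
ALLOW_PSLOT_PREEMPTION = True
```

After a condor_reconfig, newly submitted jobs should carry Groupname, which STARTD_JOB_ATTRS then publishes into the slot ad so that the MY.Groupname tests can see it.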
Thanks & Regards,
Hi Todd, All,
Thanks for this comprehensive explanation, everything is much clearer to
But I would like to take advantage of your expertise to create a
In practice we have three "pipelines" (X, Y, Z) corresponding to GROUP_NAMES.
I know that, when running, X uses 20 slots, Y uses 14 slots, and Z uses all the others.
I would like X and Y jobs never to be evicted.
When X or Y jobs start, only Z jobs may be evicted.
I'm fighting with this problem.
Can you point me to a simple solution?
must run on the same cluster. One of these must never be evicted. I
already know the number of slots that each queue uses when it is running.
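The slot counts described above could be written as static group quotas on the negotiator. This is only a sketch: the figure of 6 slots for Z assumes a hypothetical 40-slot pool, which the message does not state, and letting Z accept surplus is just one possible reading of Z taking whatever X and Y leave free.

```conf
# --- negotiator configuration ---
GROUP_NAMES = X, Y, Z

# Static quotas, counted in slot weight (by default, the number of CPUs).
GROUP_QUOTA_X = 20
GROUP_QUOTA_Y = 14
# Assumption: a hypothetical 40-slot pool leaves 6 slots for Z.
GROUP_QUOTA_Z = 6

# Allow Z to borrow slots that X and Y are not currently using.
GROUP_ACCEPT_SURPLUS_Z = True
```

On their own, quotas only decide which group is offered free slots. Whether a running Z job is evicted to make room for X or Y is a separate preemption decision.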
On 11/15/19 4:05 PM, Todd Tannenbaum wrote:
> On 11/15/2019 2:04 AM, Giuseppe Di Biase wrote:
>> Hi All,
> Hi Giuseppe!
> I think we will need more clarification on what policy you desire. We
> could help better if you could explain the scheduling policy you have in
> mind as simply as you can, ignoring HTCondor configuration issues at the
> moment. Once we understand what you want to happen, as a second step we
> can suggest what you add to your HTCondor config.
> More below...
>> I would like to avoid evictions of jobs belonging to a *user* in a
>> group defined in GROUP_NAMES.
> Under what conditions are you seeing jobs being evicted now?
> Under what conditions do you *want* HTCondor to evict jobs? As food for
> thought, some potential example answer(s) are
>    1. Never
>    2. If a job runs more than X amount of time (e.g. kill a "run-away" job)
>    3. If a job runs more than X amount of time and some other job from a
> higher priority group or user is waiting to run
>    4. If a job runs more than X amount of time, and HTCondor is trying
> to drain this node
>    5. If a job uses more memory (RAM) than requested
>    6. If a machine prefers to run specific types of jobs (e.g. GPU
> jobs), and such a preferred job is waiting to run
> Most people like to start out simple with (1), then add (2) and (5).
>> I defined this GROUP_NAMES on the node running the Negotiator/Collector
>> daemons, but how and where (in which daemons) must I define its priority
>> with respect to jobs belonging to other GROUP_NAMES?
> So you defined some accounting groups (with GROUP_NAMES), and now you
> wish to tell HTCondor how many resources each group should get?
> Let's say you have three groups X, Y, and Z. Do you want a strict
> priority across groups so that for instance X always gets machines ahead
> of Y, and Y always gets machines ahead of Z? Alternatively perhaps you
> want to say that X should get 50% of cpus in pool, Y should get 40%, and
> Z should get 10%. Either policy is possible with HTCondor. Different
> users within the same group will get a proportional share (i.e.
> 'fair-share' across users in the same group). For explanation and
> examples see
> Once you have groups the way you want, you can decide if you want
> preemption or not. Most people do NOT want preemption. For instance,
> imagine you have two groups, A and B, and you want each to get 50% of
> your pool. Imagine users in group A are using the entire pool, and then
> suddenly a group B job is submitted. Without preemption, the group B
> job will wait until some group A job exits and then it will start. With
> preemption, HTCondor will kill a group A job and give the slot it was
> running on over to group B. Most organizations prefer to avoid
> preemption, because all the cycles consumed by the killed job might be
> wasted (if the job cannot checkpoint).
>> Can i use a formula like this in SCHEDD?
>> IsX = (Experiment =?= "X")
>> IsY = (Experiment =?= "Y")
>> IsZ = (Experiment =?= "Z")
>> RANK = $(X)*70 + $(Y)*10 + $(Z)*8
> No, RANK expression in your condor_config file is only used by the
> STARTD, not the schedd, and it always implies preemption (which I am
> guessing you do not want). If you wish to define priorities / quotas
> across different groups, you will want to use GROUP_QUOTA_xxxx settings
> in the configuration of your condor_negotiator. See the above reference
> to the HTCondor Manual for more info...
> Hope the above helps,
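
To make the preemption choice described above concrete, here is a hedged sketch of two alternative negotiator settings. The group names X, Y and Z come from this thread. RemoteGroup and SubmitterGroup are attributes the HTCondor manual lists for use in PREEMPTION_REQUIREMENTS, but whether their values match the bare group names exactly on a given pool is an assumption to verify before relying on this.

```conf
# --- negotiator configuration: pick ONE of the two alternatives ---

# Alternative 1: never preempt a running job for priority or quota reasons.
PREEMPTION_REQUIREMENTS = False

# Alternative 2 (commented out): only a running Z job may be preempted,
# and only to make room for a waiting X or Y job.
# PREEMPTION_REQUIREMENTS = (RemoteGroup =?= "Z") && (SubmitterGroup =?= "X" || SubmitterGroup =?= "Y")
```

Neither alternative changes startd RANK preemption, which is configured separately on the execute nodes.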