Re: [Condor-users] Anti Affinity
- Date: Sat, 31 Jan 2009 23:11:02 +0530
- From: Sateesh Potturu <sateeshpnv@xxxxxxxxx>
- Subject: Re: [Condor-users] Anti Affinity
I was able to achieve anti affinity with the approach that I mentioned
earlier; but not along with partitionable slots.
When I use STARTD_SLOT_ATTRS along with partitionable slots, I see
ClassAd attributes like the following:
slot1.3_Cmd = "/home/sateesh/tmp/sh_loop2"
slot1.2_Cmd = "/home/sateesh/tmp/sh_loop2"
slot1.1_Cmd = "/home/sateesh/tmp/sh_loop2"
Should it not be slot1_3_Cmd, slot1_2_Cmd, slot1_1_Cmd? ( _ instead of . )
I have Cmd included in both STARTD_JOB_EXPRS and STARTD_SLOT_ATTRS.
My job requirements contain (TARGET.slot1_Cmd =!=
"/home/sateesh/tmp/sh_loop2") (repeated for each slot). With this, I
was able to achieve the anti affinity I was asking about. The mailing
list archives had configurations for uniform distribution based on
RANK, but I wanted Condor to not start an executable more than once
on any execute node, so I used Cmd.
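Since the requirements clause has to be repeated for each slot, here is a rough Python sketch of generating it. The function name is hypothetical, it assumes static slots named slot1..slotN, and joining the clauses with && is my assumption about how the repeated checks are combined.

```python
def anti_affinity_requirements(cmd, num_slots):
    """Build a requirements expression that rejects a machine when any of
    its static slots (slot1..slotN) already advertises the same Cmd.

    Hypothetical helper: assumes Cmd is published through both
    STARTD_JOB_EXPRS and STARTD_SLOT_ATTRS, so that every slot ad carries
    slot<i>_Cmd attributes for the other slots on the machine.
    """
    clauses = [
        '(TARGET.slot{0}_Cmd =!= "{1}")'.format(i, cmd)
        for i in range(1, num_slots + 1)
    ]
    # Assumption: all per-slot checks must hold, so join them with &&.
    return " && ".join(clauses)


if __name__ == "__main__":
    # Two-CPU execute nodes, as in the original question.
    expr = anti_affinity_requirements("/home/sateesh/tmp/sh_loop2", 2)
    print("requirements = " + expr)
```

The printed line can be pasted into a submit description file. It does not address the partitionable slot case, where the attribute names contain a ".".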
I tested this anti affinity without partitionable slots and it works
well, as I expected. With partitionable slots, I suspect the check
against TARGET.slot1.1_Cmd fails because "." is a separator.
I tested my observation with condor_status -constraint and it matches.
I think anti affinity requires both Requirements and Rank --
Requirements to prevent two instances from starting on the same
physical machine, and Rank to get breadth-first job distribution.
On Sat, Jan 24, 2009 at 2:50 AM, Matthew Farrellee <matt@xxxxxxxxxx> wrote:
> Sateesh Potturu wrote:
>> How can I get anti affinity behavior for jobs?
>> If I have two jobs (A and B) and two machines with two CPUs each, how
>> can I control the jobs such that both job A and job B don't run on the
>> same execute machine.
>> Can I control this using STARTD_JOB_EXPRS? I tried adding Cmd to this
>> config variable without success. But, startd reports
>> "Job wants DaemonCore starter, skipping
>> "slot1.1: Job Requirements check failed!"
> You might check the archives for discussions about uniformly
> distributing jobs and/or tightly packing them.
> There were some good example configurations.