
Re: [HTCondor-users] On a job starving issue



You can force any attribute to be part of the set used for autoclustering, so that "good" jobs will never be autoclustered with "bad" jobs.

 

Just configure

 

ADD_SIGNIFICANT_ATTRIBUTES = <Attr1>  <Attr2>  <ETC>

 

on your submit machine. <Attr1>, <Attr2>, etc. above are job attributes that are not currently used for autoclustering and that can distinguish your "good" jobs from your "bad" jobs.
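
 

As a rough sketch of what that could look like: assume the jobs are tagged at submit time with a custom attribute. The name JobClass and its values are made up for illustration; use whatever attribute actually separates your jobs.

```
# Submit machine configuration (e.g. a local config file read by the schedd).
# JobClass is a hypothetical custom job attribute, not a built-in one.
ADD_SIGNIFICANT_ATTRIBUTES = JobClass

# Submit description file: tag each job so the attribute exists in its ClassAd.
# "good" / "bad" are placeholder values for this example.
+JobClass = "good"
```

After changing the configuration, a condor_reconfig (or a schedd restart) should be needed for it to take effect. If your version supports it, condor_q -autocluster is one way to check that the jobs now land in separate autoclusters.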

 

-tj

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Weiming Shi
Sent: Monday, May 6, 2019 10:21 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] On a job starving issue

 

Hi HTCondor Community,

 

Is there a way to disable autoclustering during the matchmaking of condor jobs, or a way to re-initiate matchmaking when the runnable queue has not changed?

 

Our main motivation is to prevent the 'good' jobs (which should be scheduled) from being clustered with the 'bad' jobs (which should be rejected) when the significant attributes of the 'bad' jobs and 'good' jobs are not sufficient to separate them into two different clusters.

 

A starving issue can happen when we have multiple pools and we enable the jobs to be flocked to multiple pools concurrently (by setting flock_increment = #pools). Since the jobs can be flocked to multiple pools concurrently, the order in which the pool masters are negotiated with is no longer deterministic. When the 'bad' jobs are rejected because of the resource capacity of a particular pool, the 'good' jobs that are clustered with the 'bad' jobs are also rejected and are not reconsidered for matchmaking with other pools that have idle resources and no capacity issue.

 

Thanks

 

Weiming