
Re: [HTCondor-users] On a job starving issue



Hi John,

Thanks for the information. I will give it a try. I am curious whether changing ADD_SIGNIFICANT_ATTRIBUTES and running condor_reconfig on the submitter will change how the jobs already in the queue are clustered.




On Tue, May 7, 2019 at 9:49 AM John M Knoeller <johnkn@xxxxxxxxxxx> wrote:

You can force any attribute to be part of the set used for autoclustering, so that "good" jobs will never be autoclustered with "bad" jobs.

Just configure

ADD_SIGNIFICANT_ATTRIBUTES = <Attr1> <Attr2> <ETC>

on your submit machine. <Attr1>, <Attr2>, etc. above are attributes that you want to be used for autoclustering, that are not currently used, and that can distinguish between your "good" and your "bad" jobs.
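
As a rough illustration of the setting above, suppose the jobs carry a custom attribute that separates the two groups. The attribute name JobClass below is hypothetical and does not come from this thread; the sketch only shows the shape of the configuration, not a tested setup:

```
# Local configuration on the submit machine (sketch only).
# JobClass is a hypothetical custom job attribute; substitute whichever
# attribute actually distinguishes your "good" jobs from your "bad" ones.
ADD_SIGNIFICANT_ATTRIBUTES = JobClass
```

If no existing attribute separates the jobs, one option would be to set a custom attribute in the submit description file, for example with a line such as +JobClass = "good". Whether that fits depends on being able to tell the two kinds of jobs apart at submit time.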

-tj

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Weiming Shi
Sent: Monday, May 6, 2019 10:21 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] On a job starving issue

Hi HTCondor Community,

Is there a way to disable the autoclustering during the matchmaking of condor jobs or a way to re-initiate the matchmaking when the runnable queue is not changed?

Our main motivation is to prevent the 'good' jobs (which should be scheduled) from being clustered with the 'bad' jobs (which should be rejected) when the significant attributes of the 'bad' jobs and 'good' jobs are not sufficient to separate them into two different clusters.

A starving issue can happen when we have multiple pools and we allow the jobs to flock to multiple pools concurrently (by setting flock_increment = #pools). Since the jobs can flock to multiple pools concurrently, the order in which the pool masters are negotiated with is no longer deterministic. When the 'bad' jobs are rejected because of the resource capacity of a particular pool, the 'good' jobs that are clustered with the 'bad' jobs are also rejected and are not reconsidered for matchmaking with other pools that have idle resources and no capacity issue.
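
For reference, the flocking setup described above might look roughly like this on the submit machine. The host names are placeholders and the pool count of two is an assumption made for the example, so treat it as a sketch rather than our exact configuration:

```
# Sketch of the submit-side flocking configuration (host names are hypothetical).
FLOCK_TO = cm-a.example.org, cm-b.example.org
# Set to the number of flock targets so that both pools are considered
# together instead of one at a time.
FLOCK_INCREMENT = 2
```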

Thanks

Weiming
