
[HTCondor-users] Prioritizing machines for match making



Hello Experts,

We are using scheduler level splitting for job match making.

Objective: We want to steer jobs toward the busiest machines.

Setup Details:
We dynamically spin up nodes in the cloud based on the number of jobs in the queue. Once the nodes are up, we keep sending more jobs as the existing batch completes. We want to pack new jobs onto machines with fewer available cores, so that lightly loaded machines drain as the earlier batch completes and we can kill the vacant machines.

condor_config_val NEGOTIATOR_PRE_JOB_RANK
(10000000 * My.Rank) + (1000000 * (RemoteOwner =?= UNDEFINED)) - (100000 * Cpus) - Memory

Jobs are being allocated to the nodes seemingly at random. For example, I expected the node with a NEGOTIATOR_PRE_JOB_RANK value of 372604 to be preferred for running jobs, but it is picking the node with the negative value.

[2024-02-23 07:08:49 root@xxxxxxxxxxxxxxxxxxxxxxxxxxx ~]# condor_status -compact -af '(10000000 * My.Rank) + (1000000 * (RemoteOwner =?= UNDEFINED)) - (100000 * Cpus) - Memory' machine rank remoteowner cpus memory | sort -r -n -k1,1 | head
372604.0 c7condor1-default-us-c1-qb20.c.test.internal 0.0 undefined 6 27396
-147876.0 c7condor1-default-us-c1-zkz5.c.test.internal 0.0 undefined 11 47876
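
To show where these two numbers come from, here is a rough Python sketch that recomputes the expression from the attribute values printed above. It is only an illustration of the arithmetic, not anything HTCondor runs: the function name pre_job_rank is made up for this example, and it assumes that the boolean result of (RemoteOwner =?= UNDEFINED) counts as 1 when true, which is consistent with the values condor_status printed.

```python
def pre_job_rank(rank, remote_owner_undefined, cpus, memory):
    """Recompute the NEGOTIATOR_PRE_JOB_RANK expression from the post.

    remote_owner_undefined stands in for (RemoteOwner =?= UNDEFINED);
    it is assumed here to contribute 1 when true and 0 when false.
    """
    return (
        (10000000 * rank)
        + (1000000 * int(remote_owner_undefined))
        - (100000 * cpus)
        - memory
    )


# Attribute values copied from the condor_status output above:
# (machine, Rank, RemoteOwner is undefined, Cpus, Memory)
slots = [
    ("c7condor1-default-us-c1-qb20.c.test.internal", 0.0, True, 6, 27396),
    ("c7condor1-default-us-c1-zkz5.c.test.internal", 0.0, True, 11, 47876),
]

# Sort highest value first, mirroring `sort -r -n -k1,1`.
for name, rank, unclaimed, cpus, memory in sorted(
    slots, key=lambda s: pre_job_rank(*s[1:]), reverse=True
):
    print(pre_job_rank(rank, unclaimed, cpus, memory), name)
```

With Rank at 0 and both slots unclaimed, the node with 6 free Cpus scores higher than the node with 11 free Cpus, which is the ordering I expected the negotiator to follow.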

How can I force the behavior where nodes with fewer available CPUs are preferred?

Thanks & Regards,
Vikrant Aggarwal