We have a flock of over 40 machines (Linux & Windows) spread over 2 countries (A & B), with a couple of HAD-enabled central managers for the flock located in country A. All nodes are partitionable.
Recently we’ve tried to set a Rank expression in submit files so that machines in country B would get preferential matchmaking for jobs. We know from condor_status -server that most machines in country B have a higher mips than those in country A.
As advised by the manual at https://htcondor.readthedocs.io/en/stable/users-manual/submitting-a-job.html#about-requirements-and-rank, we’ve tried setting the expression to Rank = mips and did some tests with quick 5min jobs, submitting in both countries at times when the machines were not in use by our scientists.
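For reference, a minimal sketch of the kind of test submit file described above. The executable, arguments, file names and job count are placeholders chosen for illustration (and assume Linux execute nodes); only the rank line comes from our actual tests.

```
# Hypothetical 5-minute test job; executable and arguments are placeholders
executable   = /bin/sleep
arguments    = 300
request_cpus = 1

# Prefer machines advertising a higher Mips value
rank         = Mips

output       = test.$(Cluster).$(Process).out
error        = test.$(Cluster).$(Process).err
log          = test.$(Cluster).log

# Placeholder job count
queue 10
```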
Despite this setting, when submitting from country A, most execution nodes for most (but not all) of the runs were located in country A.
Similarly, when submitting from country B, most execution nodes for most (but not all) of the runs were located in country B.
If the flock is fully in use on one side, then the jobs get matched to the other side without any issue.
It seems to me at this point that the Rank expression does not do much and that some other criterion is used instead.
We’ve also set a custom attribute in the ClassAd of each machine which records the country it is located in, i.e.: Realm = "COUNTRY_A" or Realm = "COUNTRY_B". This way we can target a specific side of the flock through Requirements. It works well when we want to submit to one side only.
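As a rough sketch of how such an attribute is usually published and used (this assumes the attribute is added to the machine ClassAd through STARTD_ATTRS in the local configuration of each execute node, which may differ from how it is actually set here):

```
# Local configuration on an execute node in country B (sketch)
Realm = "COUNTRY_B"
STARTD_ATTRS = $(STARTD_ATTRS) Realm
```

A submit file can then restrict matching to one side of the flock:

```
# Only match machines that advertise Realm as COUNTRY_B
requirements = (TARGET.Realm == "COUNTRY_B")
```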
I’ve changed the rank expression to Rank = mips + ( 10000 * (TARGET.Realm == " COUNTRY_B ")) in order to test if this would change anything, to no avail. The observed behavior is almost always the same.
Am I missing something when specifying Rank in the submit file? Is there anything else that needs to be done on the Central Managers?
Thanks for your help.