Re: [HTCondor-users] default host ranking



On 12/1/2014 10:22 AM, Yngve Levinsen wrote:
Hi all,

I would like to set up a default rank of the hosts in our pool (unless
the user specifies another ranking). Where is this set?

You can use the configuration knob NEGOTIATOR_POST_JOB_RANK for this purpose. It may be helpful to see the recipe at
  https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToSteerJobs

Currently we
only have two machines, the central host and “host2”. It looks to me
like HTCondor is always filling up the slots on “host2” before starting
to use the slots on the central machine. I suppose this makes sense for
larger pools where you want to keep the resources on the central host
free for as long as possible. However, we would currently like it to be
the opposite (or even to distribute the jobs evenly between hosts in the pool).


On your execute machines, are you using static slots (the default), or partitionable slots?

Assuming static slots, the above recipe gives an example NEGOTIATOR_POST_JOB_RANK to distribute the jobs evenly between hosts in the pool.
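
As a rough sketch of what such a setting might look like in the central manager's configuration (an illustration only, not copied from the recipe page, which remains the authoritative source; it assumes static slots numbered slot1, slot2, ... on every machine):

```
# Sketch only; assumes static slots.
# Among unclaimed slots, prefer the lowest SlotID. Every machine's slot1
# then outranks every machine's slot2, so jobs spread across the hosts
# instead of filling one host first.
NEGOTIATOR_POST_JOB_RANK = (RemoteOwner =?= UNDEFINED) * (- SlotID)
```

Because the negotiator consults the job's own rank before NEGOTIATOR_POST_JOB_RANK, this behaves as a default that only breaks ties left by whatever rank the user supplies. Running condor_reconfig on the central manager should be enough for the negotiator to pick up the change.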

Further, I tried to use the “rank” parameter in the job file without
success. I added this line to the job configuration file:

rank = ( 2 * (machine == "host1") ) + (machine == "host2")

With this condor was still populating the slots on host2 before using
the slots on host1. I then figured maybe there is some other ranking
done, such that I need to increase the number. However, neither
rank = ( 1000 * (machine == "host1") ) + (machine == "host2")
nor
rank = ( 1000 * (machine == "host1") )
changed anything (that I noticed).

That made me think that maybe I was simply using wrong hostnames, so I
added them to the “requirements” instead. That worked: unless I wrote
“host1” and/or “host2” (spelled correctly), the respective hosts
would not be used.

Is ranking not turned on by default, or is there something else I might
be missing?


I would have expected the above to work (assuming static slots).

What version of HTCondor are you using? There was a bug related to job rank that was fixed starting with v8.2.2, which may be causing you problems. See
   https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=4403

Assuming static slots and no jobs running on host1 or host2, what happens if you try the following submit file?

requirements = machine == "host1" || machine == "host2" && ( Rank =!= UNDEFINED )

  rank  = 1000 - SlotID
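
For a self-contained test, the two lines above could be placed in a minimal submit file along these lines. This is a sketch only: the executable, arguments, and queue count are placeholders and do not come from the original job file.

```
# Sketch only: executable, arguments, and queue count are placeholders.
universe     = vanilla
executable   = /bin/sleep
arguments    = 600
# The two lines suggested above, copied unchanged:
requirements = machine == "host1" || machine == "host2" && ( Rank =!= UNDEFINED )
rank         = 1000 - SlotID
queue 4
```

If job rank is being honored, the lowest-numbered free slots should be claimed first, since a smaller SlotID yields a larger rank value. Running condor_q -run afterwards shows which host each job landed on.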

Also, does the following command
   condor_status -af machine
display "host1" and "host2" spelled exactly as you wrote them in your job submit file?

Hope the above helps,
Todd



In case I have explained myself poorly, I attach my job configuration file.

Cheers,
Yngve





--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685