
Re: [HTCondor-users] HTCondor 8.6.3 - Jobs evicted even if other slots are free



Hi Jaime,

thanks for your attention; I will reply inline.



On 4/5/19 10:36 PM, Jaime Frey wrote:
That setting for NEGOTIATOR_PRE_JOB_RANK should make the negotiator prefer unused machines.

In my case:

condor_config_val NEGOTIATOR_PRE_JOB_RANK
RemoteOwner =?= UNDEFINED


Are your execute machines configured to have static slots or partitionable slots?
We have partitionable slots.
Is ALLOW_PSLOT_PREEMPTION set in your configuration?
[root@condorcl1 ~]# condor_config_val ALLOW_PSLOT_PREEMPTION
Not defined: ALLOW_PSLOT_PREEMPTION


Have you tried running condor_q -analyze to ensure that the jobs match all of the idle machines?

Yes



The first error message suggests that COLLECTOR_HOST isn't set on the central manager.
How can I fix it?

For the second error message, it sounds like a client timed out trying to connect to the negotiator while the negotiator was busy. The two things that would talk to the negotiator are condor_userprio and condor_schedd.

How can I fix it?



Thanks a lot

Giuseppe




- Jaime

On Apr 3, 2019, at 2:07 AM, Giuseppe Di Biase <giuseppe.dibiase@xxxxxxxxx> wrote:

Hi Jaime,

on Negotiator machines:

[root@condorcl1 condor]# condor_config_val NEGOTIATOR_PRE_JOB_RANK
RemoteOwner =?= UNDEFINED


We didn't touch it, and jobs still don't use all the free slots.

I can add that in the NegotiatorLog we have a lot of:

"Failing attempt to update or invalidate collector ad because of missing daemon address (probably an unresolved hostname; daemon name is '""')."

and

"DaemonCore: Can't receive command request from 90.147.139.40 (perhaps a timeout?)"

where 90.147.139.40 is the negotiator's IP address,

and we are not able to fix them.


How can we force the free slots to be used, then?

Thanks again

Giuseppe


On 4/2/19 11:48 PM, Jaime Frey wrote:
Because of your machine RANK expression, DetChar jobs will always preempt NoEMiTest jobs on any machine. So the key to preventing the undesirable preemptions is to ensure that when the negotiator sorts all of the machines that match a DetChar job, the idle machines appear at the top of the list. With the default settings and the local config changes listed, that should be happening.

The configuration parameter NEGOTIATOR_PRE_JOB_RANK is how you enforce a particular ordering of machines for each job being matched. If all of your machines have the same RANK expression, then the default value for NEGOTIATOR_PRE_JOB_RANK should make the negotiator prefer matching idle machines. Have you changed this setting?
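
To illustrate the ordering described above, here is a toy Python model of how a pre-job rank of `RemoteOwner =?= UNDEFINED` would push unclaimed machines to the top of a candidate list. It is only a sketch, not HTCondor's actual negotiator code: the `Slot` class, the slot names, and the owner name are hypothetical, and the real negotiator also takes other factors into account (job rank, machine RANK, preemption policy) that are ignored here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    name: str
    # None models an undefined RemoteOwner attribute, i.e. an unclaimed slot.
    remote_owner: Optional[str]


def pre_job_rank(slot: Slot) -> float:
    # Models the expression `RemoteOwner =?= UNDEFINED`:
    # 1.0 for an unclaimed slot, 0.0 for a slot that is running someone's job.
    return 1.0 if slot.remote_owner is None else 0.0


def order_candidates(slots: List[Slot]) -> List[Slot]:
    # Higher rank first. sorted() is stable, so slots with equal rank
    # keep their original relative order.
    return sorted(slots, key=pre_job_rank, reverse=True)


# Hypothetical pool: two slots busy with another user's jobs, one idle.
slots = [
    Slot("slot1@node1", "noemi_user"),
    Slot("slot1@node2", None),
    Slot("slot1@node3", "noemi_user"),
]

ordered_names = [s.name for s in order_candidates(slots)]
print(ordered_names)
```

In this model the idle slot (`slot1@node2`) sorts ahead of the two claimed ones, so a job would be matched to it before any preemption is considered. If the idle machines do not match the job's requirements at all, they never enter the candidate list, which is why checking the match with condor_q -analyze matters.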