[HTCondor-users] Faulty node and idle state



Dear all,

I encountered a (now solved) problem with a faulty compute node: the scheduler had trouble reaching it, yet the node was still able to confirm acceptance of the job to the central manager, which runs on another machine.

The job remained stuck in the idle state, and the scheduler log showed that it was resubmitted to the same node over and over for hours. Hence, I was wondering whether the scheduler / central manager can be configured to avoid this kind of behaviour, i.e. so that the scheduler asks the central manager for a different node once a job has stayed idle (never started) for a while and the same node is always the one that responds to the central manager?

HTCondor version is 8.8.15-1

Best regards,

Xavier