Re: [HTCondor-users] Another jobs stuck in idle issue



Ben

Thanks for your response.

Replies embedded below...

Thanks.

Roderick

On 18/06/15 22:32, Ben Cotton wrote:
On Thu, Jun 18, 2015 at 7:43 AM, Roderick Johnstone <rmj@xxxxxxxxxxxxx> wrote:

Roderick,

I just updated my condor cluster from Fedora 20 to Fedora 22 (x86_64).

Condor is installed from the Fedora repos and went from 8.1.1 to 8.3.1.

Are both the submit host and the execute node running the same
HTCondor version?

Yes.

Is it possible that the upgrade changed some
firewall settings that block the starter from talking to the schedd?

I just checked the firewalls and that's not the problem.

A few more lines from the StartLog (before what you originally shared)
might help.

May I send this to you off-list? It's rather verbose (I turned on D_FULLDEBUG), and it lists quite a lot of settings relating to the exact host and configuration.

I'm seeing this line:
06/19/15 17:23:13 slot1_1: Slot requirements not satisfied.
but the job was matched to the execute host; otherwise it surely wouldn't have tried to run it.
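
In case a trimmed excerpt is easier to share on-list than the full D_FULLDEBUG output, here is a minimal sketch of how the lines around that message could be pulled out of a StartLog. It is plain Python, not an HTCondor tool; the function name `excerpt` and the `before`/`after` window sizes are my own assumptions, and it only relies on the message text shown above.

```python
from typing import Iterable, List

# Message text copied from the StartLog line quoted above.
MARKER = "Slot requirements not satisfied"


def excerpt(lines: Iterable[str], marker: str = MARKER,
            before: int = 5, after: int = 2) -> List[str]:
    """Return every line within `before` lines ahead of, or `after` lines
    behind, any line that contains `marker`, in original order."""
    rows = [line.rstrip("\n") for line in lines]
    keep = set()
    for i, row in enumerate(rows):
        if marker in row:
            start = max(0, i - before)
            stop = min(len(rows), i + after + 1)
            keep.update(range(start, stop))
    # Overlapping windows are merged because indices live in a set.
    return [rows[i] for i in sorted(keep)]
```

It could be fed an open file handle for the StartLog (wherever that lives on the execute host) and the result printed line by line; host-specific settings in the output would still need checking by hand before posting.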