Re: [Condor-users] 6.9.2 startup error

Hi Dan,

Yes, gadget is the scheduler and the log was produced by that machine.
I took a look at the negotiator's log to see some trace of this communication problem and I found this:

5/24 14:52:47 ---------- Started Negotiation Cycle ----------
5/24 14:52:47 Phase 1:  Obtaining ads from collector ...
5/24 14:52:47 Negotiating with szabolcs@xxxxxxxxxxxxxxxxxxx at <>
5/24 14:52:47 0 seconds so far
5/24 14:52:47 condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from <>.
5/24 14:52:47     Failed to get reply from schedd
5/24 14:52:47   Error: Ignoring schedd for this cycle
5/24 14:52:47 ---------- Finished Negotiation Cycle ----------
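
For reference, errno 10054 in the condor_read() line is the Winsock code WSAECONNRESET, i.e. the connection was opened and then reset by the other side. Below is a rough Python sketch for pulling such failures out of a negotiator log and labelling the error code. It assumes the negotiator runs on Windows (so the errno is a Winsock code); the function name and the small lookup table are only illustrative and are not part of Condor.

```python
import re

# A few Winsock error codes; only 10054 actually appears in the log above.
WINSOCK_ERRORS = {
    10054: "WSAECONNRESET (connection reset by peer)",
    10060: "WSAETIMEDOUT (connection timed out)",
    10061: "WSAECONNREFUSED (connection refused)",
}

# Matches the condor_read() failure line as it appears in the log above.
READ_FAILURE = re.compile(
    r"condor_read\(\): recv\(\) returned -?\d+, errno = (\d+)"
)

def explain_read_failures(log_text):
    """Return (log line, description) pairs for each condor_read failure."""
    results = []
    for line in log_text.splitlines():
        match = READ_FAILURE.search(line)
        if match:
            code = int(match.group(1))
            meaning = WINSOCK_ERRORS.get(code, "unknown errno %d" % code)
            results.append((line.strip(), meaning))
    return results

# Hypothetical usage with the failure line copied from the log above.
sample = (
    "5/24 14:52:47 condor_read(): recv() returned -1, errno = 10054, "
    "assuming failure reading 5 bytes from <>."
)
for line, meaning in explain_read_failures(sample):
    print(meaning)
```

A reset at this point suggests the TCP connection to the schedd was established but then dropped from the schedd side (or by something in between) before the reply was sent.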

I guess that if the negotiator can start negotiating with the machine using its IP address, then it must have connected to it somehow.
So what might cause the problem while it is waiting for the reply?


Is gadget.digicpictures.local the name of the host that this SchedLog was produced on? If so, then this sounds to me like the schedd trying to directly claim its "local" startd, because it hasn't successfully communicated with the negotiator for a long time. How long is controlled by SCHEDD_ASSUME_NEGOTIATOR_GONE, which defaults to 1200 seconds.