[Condor-users] Occasional stalls in the negotiation cycle



Every few negotiation cycles, something causes a pause of roughly 10 minutes. It always seems to occur during the "Getting all public ads" step, as shown in the NegotiatorLog excerpt below.

4/19 14:31:25 ---------- Started Negotiation Cycle ----------
4/19 14:31:25 Phase 1:  Obtaining ads from collector ...
4/19 14:31:25   Getting all public ads ...
4/19 14:40:49   Sorting 123 ads ...
4/19 14:40:49   Getting startd private ads ...
4/19 14:42:18 Got ads: 123 public and 67 private
4/19 14:42:18 Public ads include 0 submitter, 67 startd
4/19 14:42:18 Phase 2:  Performing accounting ...
4/19 14:42:18 Phase 3:  Sorting submitter ads by priority ...
4/19 14:42:18 Phase 4.1:  Negotiating with schedds ...
4/19 14:42:18 ---------- Finished Negotiation Cycle ----------
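
To see how often this happens, the gaps between consecutive log lines can be measured with a short script. This is only a minimal sketch: the function name `find_stalls`, the one-minute threshold, and the placeholder year are my own assumptions (the log timestamps carry no year), and it assumes every line of interest starts with a `M/D HH:MM:SS` stamp like the excerpt above.

```python
from datetime import datetime, timedelta


def find_stalls(lines, threshold=timedelta(minutes=1), year=2000):
    """Return (gap, previous_line, current_line) tuples for consecutive
    timestamped lines that are more than `threshold` apart.

    `year` is a placeholder, since the log stamps omit the year.
    """
    stalls = []
    prev_time = None
    prev_line = None
    for line in lines:
        parts = line.split(None, 2)  # date, time, rest of the message
        if len(parts) < 2:
            continue
        try:
            stamp = datetime.strptime(
                f"{year}/{parts[0]} {parts[1]}", "%Y/%m/%d %H:%M:%S"
            )
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if prev_time is not None and stamp - prev_time > threshold:
            stalls.append((stamp - prev_time, prev_line.strip(), line.strip()))
        prev_time, prev_line = stamp, line
    return stalls


# Hypothetical usage (the log path is an assumption; adjust for your pool):
# with open("NegotiatorLog") as log:
#     for gap, before, after in find_stalls(log):
#         print(gap, "|", before, "->", after)
```

Run against the excerpt above, this would flag two gaps: the long one after "Getting all public ads" and a shorter one after "Getting startd private ads".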

Hopefully related: 'condor_status' currently takes a long time to return any information (2+ minutes, instead of only a few seconds), and I'm seeing a lot of the following messages in the CollectorLog.

4/19 14:44:08 Buf::write(): condor_write() failed
4/19 14:44:08 SECMAN: Error sending response classad!

Any suggestions as to where I could/should focus my debugging efforts would be much appreciated! :)