
[Condor-users] Error: Could not connect to negotiator ((null))



Hi,

Condor frequently gives this message:
Error: Could not connect to negotiator ((null))
As a result, jobs stay in the Idle state for some time, until the negotiator is available again.

While this is happening,
#condor_status -negotiator
returns nothing.

The last entry in the Negotiator log at that time is:
10/15 18:41:36 ---------- Finished Negotiation Cycle ----------

After some time,
#condor_status -negotiator
shows the name of the machine running the negotiator, and the
condor_q -better -analyze command also works fine.

At that time the Negotiator log shows:
10/15 18:49:07 ---------- Finished Negotiation Cycle ----------
10/15 18:50:31 enter Matchmaker::updateCollector
10/15 18:50:31 Trying to update collector <10.201.42.242:9618>
10/15 18:50:31 Attempting to send update via UDP to collector scorpio.pesgrid.wipro.com <10.201.42.242:9618>
10/15 18:50:31 Trying to update collector <10.201.42.238:9618>
10/15 18:50:31 Attempting to send update via UDP to collector grid8.pesgrid.wipro.com <10.201.42.238:9618>
10/15 18:50:31 exit Matchmaker::UpdateCollector
10/15 18:51:03 Getting state information from the accountant
10/15 18:51:06 Getting state information from the accountant
10/15 18:51:07 ---------- Started Negotiation Cycle ----------
10/15 18:51:07 Phase 1:  Obtaining ads from collector ...
10/15 18:51:07   Getting all public ads ...


I think the Negotiator has finished its negotiation cycle.
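
One way to check this from the log is to pair each "Started Negotiation Cycle" line with the "Finished" line that follows it and see how long each cycle took. Below is a minimal Python sketch of that idea. It is not part of Condor: the function name cycle_durations and the placeholder year are my own assumptions, and the sample lines are copied from the log excerpt in this message.

```python
import re
from datetime import datetime

# Matches lines such as:
#   10/15 18:41:06 ---------- Started Negotiation Cycle ----------
CYCLE_RE = re.compile(
    r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2}) -+ (Started|Finished) Negotiation Cycle -+\s*$"
)

# The log timestamps carry no year. A placeholder year is assumed here so the
# timestamps can be parsed; only the differences between them are used.
PLACEHOLDER_YEAR = "2000"


def cycle_durations(lines):
    """Return a list of (start, end, seconds) for each Started/Finished pair."""
    cycles = []
    start = None
    for line in lines:
        match = CYCLE_RE.match(line)
        if not match:
            continue  # not a cycle boundary line
        stamp = datetime.strptime(
            PLACEHOLDER_YEAR + "/" + match.group(1), "%Y/%m/%d %H:%M:%S"
        )
        if match.group(2) == "Started":
            start = stamp
        elif start is not None:
            cycles.append((start, stamp, (stamp - start).total_seconds()))
            start = None
    return cycles


# Sample lines taken from the log excerpt in this message.
sample = [
    "10/15 18:41:06 ---------- Started Negotiation Cycle ----------",
    "10/15 18:41:36 ---------- Finished Negotiation Cycle ----------",
    "10/15 18:43:36 ---------- Started Negotiation Cycle ----------",
    "10/15 18:44:06 ---------- Finished Negotiation Cycle ----------",
]

for start, end, seconds in cycle_durations(sample):
    print(start.strftime("%H:%M:%S"), end.strftime("%H:%M:%S"), seconds)
```

For the two cycles in the excerpt this reports 30 seconds each, which is the same as the connect timeout the negotiator prints ("Will keep trying for 30 total seconds").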


This is part of the Negotiator log file:

10/15 18:39:06 ---------- Finished Negotiation Cycle ----------
10/15 18:40:31 enter Matchmaker::updateCollector
10/15 18:40:31 Trying to update collector <10.201.42.242:9618>
10/15 18:40:31 Attempting to send update via UDP to collector scorpio.pesgrid.wipro.com <10.201.42.242:9618>
10/15 18:40:31 Trying to update collector <10.201.42.238:9618>
10/15 18:40:31 Attempting to send update via UDP to collector grid8.pesgrid.wipro.com <10.201.42.238:9618>
10/15 18:40:31 exit Matchmaker::UpdateCollector
10/15 18:40:36 Getting monitoring info for pid 6169
10/15 18:41:06 ---------- Started Negotiation Cycle ----------
10/15 18:41:06 Phase 1:  Obtaining ads from collector ...
10/15 18:41:06   Getting all public ads ...
10/15 18:41:06 Trying to query collector <10.201.42.242:9618>
10/15 18:41:06   Sorting 26 ads ...
10/15 18:41:06   Getting startd private ads ...
10/15 18:41:06 Trying to query collector <10.201.42.242:9618>
10/15 18:41:06 Got ads: 26 public and 9 private
10/15 18:41:06 Public ads include 2 submitter, 9 startd
10/15 18:41:06 Entering compute_signficant_attrs()
10/15 18:41:06 Leaving compute_signficant_attrs() - result=JobUniverse,LastCheckpointPlatform,NumCkpts
10/15 18:41:06 Phase 2:  Performing accounting ...
10/15 18:41:06 Phase 3:  Sorting submitter ads by priority ...
10/15 18:41:06 Phase 4.1:  Negotiating with schedds ...
10/15 18:41:06     NumStartdAds = 9
10/15 18:41:06     NormalFactor = 1.000000
10/15 18:41:06     MaxPrioValue = 3.083870
10/15 18:41:06     NumScheddAds = 2
10/15 18:41:06   Negotiating with idealgrid@xxxxxxxxxxxxxxxxx skipped because no idle jobs
10/15 18:41:06   Schedd idealgrid@xxxxxxxxxxxxxxxxx got all it wants; removing it.
10/15 18:41:06   Negotiating with idealgrid@xxxxxxxxxxxxxxxxx at <10.201.42.247:9661>
10/15 18:41:06 0 seconds so far
10/15 18:41:06   Calculating schedd limit with the following parameters
10/15 18:41:06     ScheddPrio       = 3.083870
10/15 18:41:06     ScheddPrioFactor = 1.000000
10/15 18:41:06     scheddShare      = 0.000000
10/15 18:41:06     scheddAbsShare   = 1.000000
10/15 18:41:06     ScheddUsage      = 4
10/15 18:41:06     scheddLimit      = 5
10/15 18:41:06     userprioCrumbs   = 0 (0)
10/15 18:41:06     MaxscheddLimit   = 5
10/15 18:41:06 Socket to <10.201.42.247:9661> not in cache, creating one
10/15 18:41:06 attempt to connect to <10.201.42.247:9661> failed: No route to host (connect errno = 113).  Will keep trying for 30 total seconds (30 to go).

10/15 18:41:36 attempt to connect to <10.201.42.247:9661> failed: No route to host (connect errno = 113).
10/15 18:41:36     Failed to connect to <10.201.42.247:9661>
10/15 18:41:36   Error: Ignoring schedd for this cycle
10/15 18:41:36 ---------- Finished Negotiation Cycle ----------
10/15 18:43:36 ---------- Started Negotiation Cycle ----------
10/15 18:43:36 Phase 1:  Obtaining ads from collector ...
10/15 18:43:36   Getting all public ads ...
10/15 18:43:36 Trying to query collector <10.201.42.242:9618>
10/15 18:43:36   Sorting 23 ads ...
10/15 18:43:36   Getting startd private ads ...
10/15 18:43:36 Trying to query collector <10.201.42.242:9618>
10/15 18:43:36 Got ads: 23 public and 9 private
10/15 18:43:36 Public ads include 2 submitter, 9 startd
10/15 18:43:36 Entering compute_signficant_attrs()
10/15 18:43:36 Leaving compute_signficant_attrs() - result=JobUniverse,LastCheckpointPlatform,NumCkpts
10/15 18:43:36 Phase 2:  Performing accounting ...
10/15 18:43:36 Phase 3:  Sorting submitter ads by priority ...
10/15 18:43:36 Phase 4.1:  Negotiating with schedds ...
10/15 18:43:36     NumStartdAds = 9
10/15 18:43:36     NormalFactor = 1.000000
10/15 18:43:36     MaxPrioValue = 3.084972
10/15 18:43:36     NumScheddAds = 2
10/15 18:43:36   Negotiating with idealgrid@xxxxxxxxxxxxxxxxx skipped because no idle jobs
10/15 18:43:36   Schedd idealgrid@xxxxxxxxxxxxxxxxx got all it wants; removing it.
10/15 18:43:36   Negotiating with idealgrid@xxxxxxxxxxxxxxxxx at <10.201.42.247:9661>
10/15 18:43:36 0 seconds so far
10/15 18:43:36   Calculating schedd limit with the following parameters
10/15 18:43:36     ScheddPrio       = 3.084972
10/15 18:43:36     ScheddPrioFactor = 1.000000
10/15 18:43:36     scheddShare      = 0.000000
10/15 18:43:36     scheddAbsShare   = 1.000000
10/15 18:43:36     ScheddUsage      = 4
10/15 18:43:36     scheddLimit      = 5
10/15 18:43:36     userprioCrumbs   = 0 (0)
10/15 18:43:36     MaxscheddLimit   = 5
10/15 18:44:06   Error: Ignoring schedd for this cycle
10/15 18:44:06 ---------- Finished Negotiation Cycle ----------
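
The log above shows the negotiator failing to open a TCP connection to the schedd at <10.201.42.247:9661> with "No route to host". A plain TCP connect from the negotiator machine is one way to test whether that address is reachable independently of Condor. This is only a rough sketch: the helper name can_connect and the timeout value are my own assumptions, and the address and port are the ones from the log.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Attempt a plain TCP connect; return (ok, error_text)."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    # Address and port of the schedd as reported in the Negotiator log.
    ok, err = can_connect("10.201.42.247", 9661, timeout=3.0)
    print("reachable" if ok else "unreachable: " + err)
```

If this also fails with "No route to host" when run on the negotiator machine, the problem is probably at the network level (routing or a firewall) rather than in Condor itself.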


by
Johnson
