
[Condor-users] Negotiator problem? Jobs not assigned to idle machines.



Hi all,
 
We are running Condor v6.8 on Windows XP machines. Preemption is disabled
(section 3.6.10.5 of the manual):
 
PREEMPT = False
PREEMPTION_REQUIREMENTS = False
RANK = 0
NEGOTIATOR_CONSIDER_PREEMPTION = False
 
and
 
CLAIM_WORKLIFE = 300
 

We have 32 machines and submitted about 40 cpusoak (from v6.6) jobs, all
from simulation@xxxxxxxxx. Only 19 of them run and the rest sit idle in
the queue. Why do those jobs sit idle when no other user has jobs queued
and there are idle machines? How do we fill up the pool?
 
We have NEGOTIATOR_DEBUG = D_FULLDEBUG. Why does the NegotiatorLog (below)
show that none of the 13 remaining idle startds is assigned to run
simulation's jobs, i.e. "This schedd hit its scheddlimit."?
condor_q -analyze says of the idle jobs: "match but reject the job for
unknown reasons".
 

Thanks,
Rick

 
 
7/29 10:18:01 ---------- Started Negotiation Cycle ----------
7/29 10:18:01 Phase 1:  Obtaining ads from collector ...
7/29 10:18:01   Getting all public ads ...
7/29 10:18:01 Trying to query collector <172.25.4.150:9618>
7/29 10:18:01   Sorting 70 ads ...
7/29 10:18:01   Getting startd private ads ...
7/29 10:18:01 Trying to query collector <172.25.4.150:9618>
7/29 10:18:01 Got ads: 70 public and 32 private
7/29 10:18:01 Public ads include 1 submitter, 32 startd
7/29 10:18:01 Entering compute_signficant_attrs()
7/29 10:18:01 Leaving compute_signficant_attrs() - result=JobUniverse,LastCheckpointPlatform,NumCkpts
7/29 10:18:01 Phase 2:  Performing accounting ...
7/29 10:18:01 Trimmed out 19 startd ads not Unclaimed
7/29 10:18:01 Phase 3:  Sorting submitter ads by priority ...
7/29 10:18:01 Phase 4.1:  Negotiating with schedds ...
7/29 10:18:01     NumStartdAds = 13
7/29 10:18:01     NormalFactor = 1.000000
7/29 10:18:01     MaxPrioValue = 0.990605
7/29 10:18:01     NumScheddAds = 1
7/29 10:18:01   Negotiating with simulation@xxxxxxxx at <172.25.4.150:4557>
7/29 10:18:01 0 seconds so far
7/29 10:18:01   Calculating schedd limit with the following parameters
7/29 10:18:01     ScheddPrio       = 0.990605
7/29 10:18:01     ScheddPrioFactor = 1.000000
7/29 10:18:01     scheddShare      = 1.000000
7/29 10:18:01     scheddAbsShare   = 1.000000
7/29 10:18:01     ScheddUsage      = 19
7/29 10:18:01     scheddLimit      = 0
7/29 10:18:01     MaxscheddLimit   = 0
7/29 10:18:01 Socket to <172.25.4.150:4557> already in cache, reusing
7/29 10:18:01     Reached submitter resource limit: 0 ... stopping
7/29 10:18:01   This schedd hit its scheddlimit.
7/29 10:18:01 ---------- Finished Negotiation Cycle ----------
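
To check more than one cycle for the same pattern, here is a minimal Python sketch that pulls the "Name = value" lines out of a NegotiatorLog excerpt and flags a cycle where unclaimed startds exist but scheddLimit is 0. It is not part of Condor. The names PARAM_RE, parse_cycle_params and starved are made up for illustration, and the regular expression assumes the log format is exactly as in the excerpt above.

```python
import re

# Matches numeric "Name = value" lines that follow the timestamp, e.g.
# "7/29 10:18:01     scheddLimit      = 0"
# (format assumed from the excerpt above; other Condor versions may differ)
PARAM_RE = re.compile(
    r"^\d+/\d+ \d+:\d+:\d+\s+(\w+)\s+=\s+([-+]?\d+(?:\.\d+)?)\s*$"
)


def parse_cycle_params(lines):
    """Collect the numeric 'Name = value' pairs from one negotiation cycle."""
    params = {}
    for line in lines:
        m = PARAM_RE.match(line)
        if m:
            name, value = m.groups()
            params[name] = float(value)
    return params


def starved(params):
    """True when unclaimed startds exist but the schedd limit is zero."""
    return params.get("NumStartdAds", 0) > 0 and params.get("scheddLimit", 1) == 0


# A few lines copied from the log excerpt above.
SAMPLE = """\
7/29 10:18:01     NumStartdAds = 13
7/29 10:18:01     NormalFactor = 1.000000
7/29 10:18:01 0 seconds so far
7/29 10:18:01     ScheddUsage      = 19
7/29 10:18:01     scheddLimit      = 0
7/29 10:18:01     MaxscheddLimit   = 0
""".splitlines()

if __name__ == "__main__":
    p = parse_cycle_params(SAMPLE)
    print(p)
    print("starved:", starved(p))
```

This only reads values back out of the log. It does not try to reproduce how the negotiator computes scheddLimit.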




