
[Condor-users] Why could this match not be made?



I have a set of jobs that target one specific machine in my pool. The
machine has:

START = (Target.Owner =?= "ichesal")

So it should only run my jobs. The machine has 2 VMs running on it. They
are both Unclaimed and Idle. I submit a cluster of 10 jobs that require
this specific machine. The negotiator assigns the 144.0 process to VM2
on the machine, but then says there are no matches for the 144.1 process
and stops negotiating the cluster.
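
For readers unfamiliar with the `=?=` operator in the START expression above, here is a minimal Python sketch of how it differs from `==` when an attribute is missing. This is a toy model only: the `UNDEFINED` sentinel and the function names are hypothetical, not part of any Condor API, and it ignores other differences between the two operators such as string case handling. It illustrates why the START expression evaluates to a plain true or false for every job, and does not by itself explain the rejection of 144.1.

```python
# Toy model of ClassAd comparison when an attribute may be missing.
# UNDEFINED is a hypothetical sentinel for this sketch, not a Condor API.
UNDEFINED = object()


def meta_equals(a, b):
    """Model of `=?=`: always yields True or False, never UNDEFINED."""
    if a is UNDEFINED or b is UNDEFINED:
        # Only identical when both sides are UNDEFINED.
        return a is b
    return type(a) is type(b) and a == b


def equals(a, b):
    """Model of `==`: the result is UNDEFINED if either side is UNDEFINED."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return a == b


def start(job_ad):
    """Model of START = (Target.Owner =?= "ichesal") against a job ad."""
    owner = job_ad.get("Owner", UNDEFINED)
    return meta_equals(owner, "ichesal")
```

With this model, `start({"Owner": "ichesal"})` is true, while a job ad with a different owner or with no `Owner` attribute at all gives false rather than an undefined result.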

VM1 is clearly free, so why didn't 144.1 get assigned to this VM? Why was
"no match found" returned?

In the negotiator log I see:

2/28 18:50:31   Negotiating with ichesal@xxxxxxxxxx at <137.57.142.112:38437>
2/28 18:50:31   Calculating schedd limit with the following parameters
2/28 18:50:31     ScheddPrio       = 49949084.000000
2/28 18:50:31     ScheddPrioFactor = 1.000000
2/28 18:50:31     scheddShare      = 0.000000
2/28 18:50:31     scheddAbsShare   = 0.066667
2/28 18:50:31     ScheddUsage      = 1
2/28 18:50:31     scheddLimit      = 500000
2/28 18:50:31     MaxscheddLimit   = 500000
2/28 18:50:31 Socket to <137.57.142.112:38437> already in cache, reusing
2/28 18:50:31     Sending SEND_JOB_INFO/eom
2/28 18:50:31     Getting reply from schedd ...
2/28 18:50:31     Got JOB_INFO command; getting classad/eom
2/28 18:50:31     Request 00144.00000:
2/28 18:50:31       Connecting to startd vm2@xxxxxxxxxxxxxxxxxxxxxxxxx at <137.57.176.228:1421>
2/28 18:50:31 NEGOTIATOR_TIMEOUT_MULTIPLIER is undefined, using default value of 0
2/28 18:50:31 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
2/28 18:50:31       Sending MATCH_INFO/capability
2/28 18:50:31       (Capability is "<137.57.176.228:1421>#1109624347#24")
2/28 18:50:31       Sending PERMISSION, capability, startdAd to schedd
2/28 18:50:31       Notifying the accountant
2/28 18:50:31       Successfully matched with vm2@xxxxxxxxxxxxxxxxxxxxxxxxx
2/28 18:50:31     Sending SEND_JOB_INFO/eom
2/28 18:50:31     Getting reply from schedd ...
2/28 18:50:31     Got JOB_INFO command; getting classad/eom
2/28 18:50:31     Request 00144.00001:
2/28 18:50:31       Rejected 144.1 ichesal@xxxxxxxxxx <137.57.142.112:38437>: no match found
2/28 18:50:31     Sending SEND_JOB_INFO/eom
2/28 18:50:31     Getting reply from schedd ...
2/28 18:50:31     Got NO_MORE_JOBS;  done negotiating
2/28 18:50:31   Schedd ichesal@xxxxxxxxxx got all it wants; removing it.

And in my schedd log for the submitting machine I see:

2/28 18:58:59 Activity on stashed negotiator socket
2/28 18:58:59
2/28 18:58:59 Entered negotiate
2/28 18:58:59 NEGOTIATOR_TIMEOUT is undefined, using default value of 20
2/28 18:58:59 *** SwapSpace = 2436404
2/28 18:58:59 *** ReservedSwap = 5120
2/28 18:58:59 *** Shadow Size Estimate = 1800
2/28 18:58:59 *** Start Limit For Swap = 1350
2/28 18:58:59 *** Current num of active shadows = 0
2/28 18:58:59 Negotiating for owner: ichesal@xxxxxxxxxx
2/28 18:58:59 Checking consistency running and runnable jobs
2/28 18:58:59 Tables are consistent
2/28 18:58:59 Sent job 144.0 (autocluster=144)
2/28 18:58:59 In case PERMISSION
2/28 18:58:59 Enqueued contactStartd startd=<137.57.176.228:1421>
2/28 18:58:59 Sent job 144.1 (autocluster=144)
2/28 18:58:59 Job 144.1 rejected: no match found
2/28 18:59:00 Out of servers - 1 jobs matched, 9 jobs idle, 1 jobs rejected
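
To make the sequence in the schedd log easier to follow, here is a rough Python sketch that pulls out which jobs were offered to the negotiator and which were rejected. The regular expressions are based only on the line shapes in the excerpt above, so they are an assumption about the log format in general, and `summarize` is a hypothetical helper name, not a Condor tool.

```python
import re

# Patterns inferred from the schedd log excerpt; other log lines are ignored.
SENT = re.compile(r"Sent job (\d+\.\d+) \(autocluster=(\d+)\)")
REJECTED = re.compile(r"Job (\d+\.\d+) rejected: (.+)")


def summarize(log_text):
    """Return (sent, rejected): job id -> autocluster, job id -> reason."""
    sent = {}
    rejected = {}
    for line in log_text.splitlines():
        m = SENT.search(line)
        if m:
            sent[m.group(1)] = int(m.group(2))
            continue
        m = REJECTED.search(line)
        if m:
            rejected[m.group(1)] = m.group(2).strip()
    return sent, rejected


SAMPLE = """\
2/28 18:58:59 Sent job 144.0 (autocluster=144)
2/28 18:58:59 In case PERMISSION
2/28 18:58:59 Enqueued contactStartd startd=<137.57.176.228:1421>
2/28 18:58:59 Sent job 144.1 (autocluster=144)
2/28 18:58:59 Job 144.1 rejected: no match found
"""

sent, rejected = summarize(SAMPLE)
print("sent:", sent)
print("rejected:", rejected)
```

Run on the excerpt, this shows that both 144.0 and 144.1 were sent under the same autocluster (144) and that only 144.1 was rejected, after which the other jobs in the cluster were never offered.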

- Ian