
Re: [Condor-users] slots on SMP machine always in state Owner



A follow-up to my previous post about problems with the DedicatedScheduler on an SMP machine.

Here's an excerpt from the StartLog file on the SMP host:

...
8/27 21:34:27 slot5: Request to claim resource refused.
8/27 21:34:27 slot5: Machine requirements not satisfied.
8/27 21:34:27 slot5: State change: claiming protocol failed
8/27 21:34:27 slot5: Changing state: Matched -> Owner
8/27 21:34:27 slot6: Request to claim resource refused.
8/27 21:34:27 slot6: Machine requirements not satisfied.
8/27 21:34:27 slot6: State change: claiming protocol failed
8/27 21:34:27 slot6: Changing state: Matched -> Owner
...

If I submit a regular vanilla universe job, the resource is matched and claimed just fine. Why does the claim fail when the job is submitted to the parallel universe?

Rok

Rok Roškar wrote:
I've got a 16-CPU SMP machine that is configured as follows:
----------------
local.config:
DedicatedScheduler = "DedicatedScheduler@hostname"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

#SLOTS_CONNECTED_TO_CONSOLE = 16
#SLOTS_CONNECTED_TO_KEYBOARD = 0

MachineClass = "SMP"
STARTD_ATTRS = $(STARTD_ATTRS), MachineClass

START = JobClass =!= UNDEFINED && JobClass == "SMP"

Rank = Scheduler =?= $(DedicatedScheduler)
----------------
Even after the machine has been sitting idle for a while (an hour or so), none of the slots are listed as "Unclaimed"; instead, they're always in state "Owner". Any idea why this is? As a result, jobs submitted to the parallel universe never start: the DedicatedScheduler is matched with the correct resources, but the job isn't allowed to start, even though the START expression ignores keyboard and console activity entirely.

Any thoughts on what I'm missing here are most welcome!

Thanks,

Rok
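
One possible explanation, offered as an assumption and not confirmed anywhere in this thread: the claim request sent by the dedicated scheduler may not carry the job's JobClass attribute. In that case START = JobClass =!= UNDEFINED && JobClass == "SMP" would evaluate to False for the request, which would be consistent with the "Machine requirements not satisfied" lines in the StartLog. The same expression also evaluates to False when no job ad is present at all, so if the startd's owner policy is defined in terms of START being False, idle slots would be reported as Owner and not Unclaimed. A minimal sketch of a START that additionally accepts the dedicated scheduler, reusing the Scheduler =?= $(DedicatedScheduler) test already present in the Rank line above:

```
# Sketch only, untested on this machine.
# Assumption: the dedicated scheduler's claim request advertises a
# Scheduler attribute equal to the DedicatedScheduler string above.
START = (Scheduler =?= $(DedicatedScheduler)) || \
        (JobClass =!= UNDEFINED && JobClass == "SMP")

# Unchanged from the original config: prefer the dedicated scheduler.
Rank = Scheduler =?= $(DedicatedScheduler)
```

This addresses only the refused claim. START is still False when there is no job ad, so the slots may continue to show as Owner while idle. The startd needs a condor_reconfig to pick up the change.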

_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at: https://lists.cs.wisc.edu/archive/condor-users/