
Re: [Condor-users] Partioning into slot type -1 (minus one)?



On 09/02/2011 03:28 PM, Carsten Aulbert wrote:
Hi all

I'm currently playing with partitionable slots in 7.6.3 again and found
something strange in the log files:

09/02/11 21:19:34 (fd:10) (pid:16817) slot1: Schedd addr =<10.20.40.16:44514>
09/02/11 21:19:34 (fd:10) (pid:16817) slot1: Alive interval = 300
09/02/11 21:19:34 (fd:10) (pid:16817) slot1: Received ClaimId from schedd
(<10.10.16.75:45488>#1314991032#1#...)
09/02/11 21:19:34 (fd:10) (pid:16817) slot1: Match requesting resources:
cpus=4 memory=1500 disk=1%
09/02/11 21:19:34 (fd:10) (pid:16817) SLOT_TYPE_-1_PARTITIONABLE is undefined,
using default value of False
09/02/11 21:19:34 (fd:10) (pid:16817) slot1: LoadQueue: Adding 1 entries of
value 0.000000
09/02/11 21:19:34 (fd:10) (pid:16817) slot1: LoadQueue: 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09/02/11 21:19:34 (fd:10) (pid:16817) slot1: LoadQueue: Size: 60  Avg value:
0.00  Share of system load: 0.00
09/02/11 21:19:34 (fd:10) (pid:16817) 470270688 kbytes available for
"/local/condor.n1675/execute"
09/02/11 21:19:34 (fd:10) (pid:16817) slot1: Total execute space: 470265568
09/02/11 21:19:34 (fd:10) (pid:16817) slot1_1: New machine resource of type -1
allocated
09/02/11 21:19:34 (fd:10) (pid:16817) slot1_1: LoadQueue: Adding 1 entries of
value 0.000000
09/02/11 21:19:34 (fd:10) (pid:16817) slot1_1: LoadQueue: 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09/02/11 21:19:34 (fd:10) (pid:16817) slot1_1: LoadQueue: Size: 60  Avg value:
0.00  Share of system load: 0.00
09/02/11 21:19:34 (fd:10) (pid:16817) 470270688 kbytes available for
"/local/condor.n1675/execute"
09/02/11 21:19:34 (fd:10) (pid:16817) slot1_1: Total execute space: 470265568

The slot becomes matched, but the resource type is '-1' rather than the 1 I
would expect, and the slot is then torn down again.

config only has this:

root@n1675:/var/log/condor# condor_config_val -dump|grep SLOT
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpu=100%, ram=100%, swap=50%
SLOT_TYPE_1_PARTITIONABLE = True

Has anyone seen this also and can give me a hint where I left the true path?

Cheers

Carsten

The new dynamic slot isn't of type 1; it's of a new type that has "resources: cpus=4 memory=1500 disk=1%". All positive numbers could be configured types (SLOT_TYPE_#...), and type 0 has special meaning in the code (shhh!). A type is necessary (maybe it shouldn't be), and type -1 was available and shouldn't be configurable. It's an implementation detail that unfortunately bubbles into the logs at some debug levels. Nothing to worry about.
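
If you want to confirm from the log which slots were created with which type, a rough Python sketch along these lines might help. It is not part of Condor; the function names and the labels are made up for illustration, and the regular expression only assumes the "New machine resource of type N" wording seen in the log excerpt above. The mapping of numbers to meanings just restates the explanation in this mail.

```python
import re

# Assumed log wording, taken from the excerpt in this thread:
#   "slot1_1: New machine resource of type -1 allocated"
NEW_RESOURCE = re.compile(
    r"(slot\d+(?:_\d+)?): New machine resource of type (-?\d+)"
)


def classify_slot_type(type_id):
    """Hypothetical helper: label a slot type number as described above."""
    if type_id > 0:
        # Positive numbers can be configured via SLOT_TYPE_<N> settings.
        return "configured (SLOT_TYPE_%d)" % type_id
    if type_id == 0:
        return "special meaning inside the startd"
    if type_id == -1:
        # Used for dynamic slots carved out of a partitionable slot.
        return "dynamic slot (internal type, not configurable)"
    return "unknown"


def new_resources(log_text):
    """Return (slot_name, type_id) pairs for each newly allocated resource."""
    return [(m.group(1), int(m.group(2))) for m in NEW_RESOURCE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "09/02/11 21:19:34 (fd:10) (pid:16817) slot1_1: "
        "New machine resource of type -1\nallocated\n"
    )
    for name, type_id in new_resources(sample):
        print(name, type_id, classify_slot_type(type_id))
```

Run against the log, an entry for slot1_1 with type -1 would then be labelled as a dynamic slot, which is the expected outcome for a claim against a partitionable slot.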

Sounds like you're still on the true path.

Best,


matt