[HTCondor-users] Dynamic slots problem




Hi everyone,

I'm using ARC CE with HTCondor, and I want to run the jobs in the Docker
universe. Unfortunately, the jobs don't work with dynamic slots. To start
with, I have 2 nodes with the following configuration:
wn172:
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=100%

wn131:
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=100%
SLOT_TYPE_1_PARTITIONABLE = TRUE

Both have 8 cores and Docker:

condor_status -af Name Cpus
slot1@xxxxxxxxxxxxxx 8
wn172.nipne.ro 8

condor_status -af Name HasDocker
slot1@xxxxxxxxxxxxxx true
wn172.nipne.ro true

But the jobs are running only on wn172. The analysis output for an idle job is:

 condor_q 4552 -better-analyze


-- Schedd: arc-htc.nipne.ro : <81.180.86.133:10722>
The Requirements expression for job 4552.000 is

    ( TARGET.HasDocker ) && ( TARGET.Cpus >= RequestCpus )

Job 4552.000 defines the following attributes:

    RequestCpus = 8

The Requirements expression for job 4552.000 reduces to these conditions:

         Slots
Step    Matched  Condition
-----  --------  ---------
[0]           2  TARGET.HasDocker
[1]           2  TARGET.Cpus >= RequestCpus

No successful match recorded.
Last failed match: Mon Oct 16 14:22:21 2017

Reason for last match failure: no match found

4552.000:  Run analysis summary ignoring user priority.  Of 2 machines,
      0 are rejected by your job's requirements
      0 reject your job because of their own requirements
      1 are exhausted partitionable slots
      1 match and are already running your jobs
      0 match but are serving other users
      0 are available to run your job

If I also add "SLOT_TYPE_1_PARTITIONABLE = TRUE" to the wn172 config,
that node no longer runs any jobs and I get the same message:
"are exhausted partitionable slots"
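
In case it helps with diagnosis, the resources advertised by the slots can
be compared with what the job requests, using something like the commands
below. This is only a sketch: it assumes that the standard Memory, Disk and
PartitionableSlot machine attributes and the RequestMemory and RequestDisk
job attributes are the relevant ones here, which I have not confirmed.

```
# What each slot advertises, and whether it is partitionable
condor_status -af Name Cpus Memory Disk PartitionableSlot

# What the idle job asks for (job 4552 from the analysis above)
condor_q 4552 -af RequestCpus RequestMemory RequestDisk
```

If any requested value is larger than what the partitionable slot
advertises, that might explain why the slot is counted as exhausted even
though the Cpus condition matches.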

Does anyone have any idea?

Thanks in advance,
Mihai