
Re: [HTCondor-users] dynamic slots



The condor_q -analyze output below shows that the job matches the slot, but it also shows 0 machines for all of the counters in the last clause, and 

No successful match recorded.
Last failed match: Fri Feb 23 14:38:52 2018   

That probably indicates that the slot's own requirements don't accept the job for some reason. Try running the analysis in the reverse direction, from the slot's point of view:

condor_q -better:reverse 38720 -machine slot1@chopin

-tj

-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Larry Martell
Sent: Friday, February 23, 2018 1:47 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] dynamic slots

I am trying to use dynamic slots as documented here:

http://research.cs.wisc.edu/htcondor/CondorWeek2012/presentations/thain-dynamic-slots.pdf

I have configured 1 slot thusly:

NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=75%
SLOT_TYPE_1 = mem=64000
SLOT_TYPE_1_PARTITIONABLE = true
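
(A note on the configuration above, offered as an assumption rather than a confirmed diagnosis: HTCondor configuration keeps only the last value assigned to a given name, so the second SLOT_TYPE_1 line would replace the first and the cpus=75% setting would be lost. If both limits are intended, a minimal sketch of the single-line, comma-separated form would be:)

```
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
# both resource limits in one assignment, so neither overrides the other
SLOT_TYPE_1 = cpus=75%, mem=64000
SLOT_TYPE_1_PARTITIONABLE = true
```

(This would affect the CPU share of the slot; it is not known to be the reason the job fails to match.)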

I submit a job that requires 10G of memory and it does not run:

$ condor_q -better-analyze 38720


-- Schedd: bach.elucid.local : <192.168.10.2:9618?...
The Requirements expression for job 38720.000 is

    ( ( Memory >= 10000 ) ) && ( TARGET.Arch == "X86_64" ) &&
    ( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= RequestDisk ) &&
    ( TARGET.Memory >= RequestMemory ) && ( TARGET.HasFileTransfer )

Job 38720.000 defines the following attributes:

    DiskUsage = 0
    ImageSize = 0
    RequestDisk = DiskUsage
    RequestMemory = ifthenelse(MemoryUsage =!= undefined,MemoryUsage,( ImageSize + 1023 ) / 1024)

slot1@chopin has the following attributes:

    TARGET.Memory = 64000
    TARGET.Arch = "X86_64"
    TARGET.Disk = 90191948
    TARGET.HasFileTransfer = true
    TARGET.OpSys = "LINUX"

The Requirements expression for job 38720.000 reduces to these conditions:

         Slots
Step    Matched  Condition
-----  --------  ---------
[0]           1  Memory >= 10000
[1]           1  TARGET.Arch == "X86_64"
[3]           1  TARGET.OpSys == "LINUX"
[5]           1  TARGET.Disk >= RequestDisk
[7]           1  TARGET.Memory >= RequestMemory
[9]           1  TARGET.HasFileTransfer

No successful match recorded.
Last failed match: Fri Feb 23 14:38:52 2018

Reason for last match failure: no match found

38720.000:  Run analysis summary ignoring user priority.  Of 1 machines,
      0 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match and are already running your jobs
      0 match but are serving other users
      0 are available to run your job
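
(One hedged reading of the output above: assuming MemoryUsage is undefined, since it is not listed among the job's attributes, and with ImageSize = 0, the default RequestMemory expression works out to ( 0 + 1023 ) / 1024, which is 0 under integer division. The job would then be asking the partitionable slot for no memory, even though its Requirements mention Memory >= 10000. A minimal sketch of a submit description that requests the memory explicitly, assuming the job needs about 10000 MB; the executable name is hypothetical:)

```
# my_job.sh is a placeholder; substitute the real executable
executable     = my_job.sh
# ask the partitionable slot to carve out 10000 MB (default unit is MB)
request_memory = 10000
request_cpus   = 1
queue
```

(Whether this is the cause of the failed match is not confirmed by the analysis output.)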

Can anyone tell me why it's not running?