Re: [HTCondor-users] Jobs rejected "because of their own requirements"


Maybe you need the FQDN for the -reverse option:

condor_q -better-analyze 2.0 -reverse -machine <fqdn>

If that does not work, try slot<number>@<fqdn> as the machine name.

You can always add a memory request to your job submit file using:  request_memory = <quantity> (in MB)
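
As a rough sketch, assuming your sleep.sub looks like the quick start example (the executable and file names below are my assumption, not taken from your post, and 128 is only an illustrative value), the explicit request would go in like this:

```
# sleep.sub with an explicit memory request (file names are assumed)
executable              = sleep.sh
log                     = sleep.log
output                  = outfile.txt
error                   = errors.txt
should_transfer_files   = Yes
when_to_transfer_output = ON_EXIT

# Illustrative value; a bare number is interpreted as MB.
request_memory          = 128

queue
```

As far as I can tell, the RequestMemory expression in your analysis output is just the default that gets used when the submit file has no request_memory. The "=!= undefined" part only tests whether MemoryUsage has been set yet; with ImageSize = 1 it falls through to (1 + 1023) / 1024 = 1 MB, so the attribute itself is not undefined.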


I just set up a one-machine cluster on a Fedora workstation, using the default package (8.8.10), and this is my first time setting up a condor cluster using roles. I followed the "quick start" guide in the administration part of the manual, setting CentralManager, Exec, and Submit roles, along with password authentication, and everything looks good. It's an 18-core machine with hyperthreading, and 36 slots show up in condor_status. I submitted "sleep.sub" from https://research.cs.wisc.edu/htcondor/manual/quickstart.html, and the job remains Idle. It looks like it's being rejected by the negotiator because "36 reject your job because of their own requirements". That's new for me. I could use some help debugging that.

$ condor_q -better-analyze 2.0

-- Schedd: clh-8842.lab.core : <
The Requirements expression for job 2.000 is

    (TARGET.Arch == "X86_64") && (TARGET.OpSys == "LINUX") && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory) &&

Job 2.000 defines the following attributes:

    DiskUsage = 1
    ImageSize = 1
    RequestDisk = DiskUsage
    RequestMemory = ifthenelse(MemoryUsage =!= undefined,MemoryUsage,(ImageSize + 1023) / 1024)

The Requirements expression for job 2.000 reduces to these conditions:

Step    Matched  Condition
-----  --------  ---------
[0]          36  TARGET.Arch == "X86_64"
[1]          36  TARGET.OpSys == "LINUX"
[3]          36  TARGET.Disk >= RequestDisk
[5]          36  TARGET.Memory >= RequestMemory
[7]          36  TARGET.HasFileTransfer

No successful match recorded.
Last failed match: Thu Sep 10 18:03:28 2020

Reason for last match failure: no match found

002.000:  Run analysis summary ignoring user priority.  Of 36 machines,
      0 are rejected by your job's requirements
     36 reject your job because of their own requirements
      0 match and are already running your jobs
      0 match but are serving other users
      0 are able to run your job

WARNING:  Be advised:
   Job did not match any machines's constraints
   To see why, pick a machine that you think should match and add
     -reverse -machine <name>
   to your query.

For what it's worth, adding "-reverse -machine clh-8842.core.lab" to the query didn't return anything useful.

I'm guessing the problem might be the "undefined" in the RequestMemory attribute, but I'm not sure, and I'm not sure why it's undefined.

