
Re: [HTCondor-users] Jobs rejected "because of their own requirements"



Found the problem(s).

I never did get the "condor_q -better-analyze 22 -reverse fqdn" syntax to work, but replacing it with "condor_q -better-analyze:reverse 22" did. It then told me that the negotiator was matching things just fine.

It seems firewalld was blocking the condor port. The Fedora zone configuration is pretty complicated, and adding the condor port to the "public" zone alone doesn't suffice. I'll get it worked out eventually, but condor worked as soon as I disabled the firewall.
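
For anyone hitting the same firewall problem, here is a minimal sketch of opening the port in the zone that is actually in use, instead of disabling firewalld. It assumes HTCondor listens on its default port 9618/tcp (the port in the Schedd address in the quoted output below) and that the interface may be bound to a zone other than "public"; on Fedora Workstation the default zone is often "FedoraWorkstation", but that should be checked rather than assumed. The helper name open_condor_port is hypothetical, and the commands need root.

```shell
# Hypothetical helper: open a TCP port in a given firewalld zone.
# Usage: open_condor_port ZONE [PORT]   (run as root)
open_condor_port() {
    zone="$1"
    port="${2:-9618}"   # assumption: HTCondor's default port

    # Open the port permanently in that zone, then reload so it takes effect.
    firewall-cmd --permanent --zone="$zone" --add-port="${port}/tcp" &&
    firewall-cmd --reload &&
    # Print the ports now open in the zone as a check.
    firewall-cmd --zone="$zone" --list-ports
}

# First find out which zone the network interface really uses:
#   firewall-cmd --get-active-zones
# then, for example:
#   open_condor_port FedoraWorkstation
```

If the port shows up in --list-ports for the active zone and jobs still stay idle, the firewall is probably not the only issue.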

Regards,
Grant

On Fri, Sep 11, 2020 at 1:22 AM <christoph.beyer@xxxxxxx> wrote:
Hi,

Maybe you need the FQDN for the reverse option:

condor_q -better-analyze 2.0 -reverse -machine <fqdn>

If that does not work try slot<number>@<fqdn> ...
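
The name given to -machine has to match a name the collector knows, so it can help to list those names first. A small sketch, assuming the -autoformat option of condor_status is available in the installed version; the helper name reverse_analyze is hypothetical.

```shell
# Hypothetical helper: run the reverse analysis of one job against one slot.
# Usage: reverse_analyze JOB_ID SLOT_NAME
reverse_analyze() {
    job="$1"
    slot="$2"   # e.g. slot1@<fqdn>, exactly as condor_status prints it
    condor_q -better-analyze "$job" -reverse -machine "$slot"
}

# List the slot names and host names as the collector reports them:
#   condor_status -autoformat Name Machine
# then pass one of those names:
#   reverse_analyze 2.0 'slot1@<fqdn>'
```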

You can always add a memory request to your job submit file using: request_memory = <quantity> (in MB)
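
A sketch of a submit file with an explicit memory request. The file name sleep_mem.sub, the 128 MB value and the use of /bin/sleep are assumptions for illustration, not the quick start file itself.

```shell
# Write a small submit description file with an explicit memory request.
cat > sleep_mem.sub <<'EOF'
executable     = /bin/sleep
arguments      = 60
log            = sleep_mem.log
request_memory = 128
queue
EOF

# Submit it with:
#   condor_submit sleep_mem.sub
```

A bare number for request_memory is read as megabytes.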


Best
christoph

--
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx


From: g2boojum@xxxxxxxxx
To: "htcondor-users" <htcondor-users@xxxxxxxxxxx>
Sent: Friday, 11 September 2020 01:25:02
Subject: [HTCondor-users] Jobs rejected "because of their own requirements"

I just set up a one-machine cluster on a Fedora workstation, using the default package (8.8.10), and this is my first time setting up a condor cluster using roles. I followed the "quick start" guide in the administration part of the manual, setting CentralManager, Exec, and submit roles, along with password authentication, and everything looks good. It's an 18-core machine with hyperthreading, and 36 slots show up in condor_status. I submitted "sleep.sub" from https://research.cs.wisc.edu/htcondor/manual/quickstart.html, and the job remains Idle. Looks like it's being rejected by the negotiator because "36 reject your job because of their own requirements". That's new for me. I could use some help debugging that.

$ condor_q -better-analyze 2.0


-- Schedd: clh-8842.lab.core : <172.16.8.48:9618?...
The Requirements expression for job 2.000 is

  (TARGET.Arch == "X86_64") && (TARGET.OpSys == "LINUX") && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory) &&
  (TARGET.HasFileTransfer)

Job 2.000 defines the following attributes:

  DiskUsage = 1
  ImageSize = 1
  RequestDisk = DiskUsage
  RequestMemory = ifthenelse(MemoryUsage =!= undefined,MemoryUsage,(ImageSize + 1023) / 1024)

The Requirements expression for job 2.000 reduces to these conditions:

         Slots
Step    Matched  Condition
-----  --------  ---------
[0]          36  TARGET.Arch == "X86_64"
[1]          36  TARGET.OpSys == "LINUX"
[3]          36  TARGET.Disk >= RequestDisk
[5]          36  TARGET.Memory >= RequestMemory
[7]          36  TARGET.HasFileTransfer

No successful match recorded.
Last failed match: Thu Sep 10 18:03:28 2020

Reason for last match failure: no match found

002.000:  Run analysis summary ignoring user priority.  Of 36 machines,
      0 are rejected by your job's requirements
     36 reject your job because of their own requirements
      0 match and are already running your jobs
      0 match but are serving other users
      0 are able to run your job

WARNING:  Be advised:
   Job did not match any machines's constraints
   To see why, pick a machine that you think should match and add
     -reverse -machine <name>
   to your query.

For what it's worth, adding "-reverse -machine clh-8842.core.lab" to the query didn't return anything useful.

I'm guessing the problem might be the "undefined" in the RequestMemory attribute, but I'm not sure, and I'm not sure why it's undefined.
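
A note on the "undefined" in RequestMemory: that expression is the default inserted when the submit file has no request_memory, and =!= is a comparison that still gives a true/false answer when MemoryUsage is missing. Before the job has run, MemoryUsage is undefined, so the expression falls back to (ImageSize + 1023) / 1024, which for ImageSize = 1 comes to 1 MB. A shell sketch of that arithmetic, with an empty second argument standing in for "undefined" (the function name request_memory_mb is hypothetical):

```shell
# Hypothetical helper mirroring the default RequestMemory expression.
# Usage: request_memory_mb IMAGE_SIZE_KB [MEMORY_USAGE_MB]
request_memory_mb() {
    image_size_kb="$1"
    memory_usage_mb="${2:-}"   # empty stands in for ClassAd "undefined"

    if [ -n "$memory_usage_mb" ]; then
        # MemoryUsage is known: request that much.
        echo "$memory_usage_mb"
    else
        # Otherwise estimate from ImageSize (KB) using integer division.
        echo $(( (image_size_kb + 1023) / 1024 ))
    fi
}

request_memory_mb 1   # prints 1
```

So the job asks for 1 MB, which is consistent with all 36 slots passing the TARGET.Memory >= RequestMemory step in the output above.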


Thanks,
Grant
--
Grant Goodyear
web: http://www.grantgoodyear.org
e-mail: grant@xxxxxxxxxxxxxxxxx

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/

