
Re: [HTCondor-users] Idle New Machines After Adding to HTCondor Server: Possible Causes?



Hi Thomas,

thank you for your help.

Here is the output of the command:


---
slot120@xxxxxxxxxxxx LINUX      X86_64 Unclaimed Idle      0.000 3195 16+14:09:03

slot121@xxxxxxxxxxxx LINUX      X86_64 Unclaimed Idle      0.000 3195 16+18:21:45
----
[root@ce ~]# condor_q –better-analyze:reverse –machine slot121@xxxxxxxxxxxx 
 
 
-- Schedd: ce.xxx : <xxx.xxx.xxx.xxx:9203?... @ 08/24/23 07:19:33
OWNER BATCH_NAME      SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS
 
Total for query: 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
Total for all users: 1460 jobs; 0 completed, 0 removed, 0 idle, 1460 running, 0 held, 0 suspended
============================



Here is the output for a machine that is Claimed/Busy:
------

slot120@xxxxxxxxxxxx LINUX      X86_64 Claimed   Busy      0.040 3195  0+03:25:54

slot121@xxxxxxxxxxxx LINUX      X86_64 Claimed   Busy      1.000 3195  0+02:47:09

---

[root@ce ~]# condor_q –better-analyze:reverse –machine slot121@xxxxxxxxxxxx

-- Schedd: ce.xxx : <xxx.xxx.xxx.xxx:9203?... @ 08/24/23 07:13:15
OWNER BATCH_NAME      SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS
 
Total for query: 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
Total for all users: 1458 jobs; 1 completed, 0 removed, 0 idle, 1457 running, 0 held, 0 suspended
================================

Regards,

Eraldo Jr

 


Quoting Thomas Hartmann <thomas.hartmann@xxxxxxx>:

Hi Eraldo,

can you try a reverse analysis and check which jobs would in principle match a given slot/machine? E.g.,

condor_q -better-analyze:reverse -machine slot1@xxxxxxxxxxx

That should give you a list of the jobs that could in principle be matched to that machine, along with the requirements of the slot. Maybe one of the machine requirements is slightly off and keeps jobs from matching?
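A minimal sketch of that check, assuming a hypothetical slot name (`slot1@worker.example.org`) that has to be replaced with a real one from `condor_status`. The options start with a plain ASCII hyphen; a typographic dash introduced by a mail client would not be parsed as an option. The guard only makes the sketch harmless on a machine without the HTCondor tools installed.

```shell
#!/bin/sh
# Hypothetical slot name for illustration; substitute one listed by condor_status.
SLOT="${SLOT:-slot1@worker.example.org}"

# Print the reverse-analysis command line (ASCII hyphens) for a given slot.
reverse_cmd() {
    printf 'condor_q -better-analyze:reverse -machine %s\n' "$1"
}

if command -v condor_q >/dev/null 2>&1; then
    # Analyze the slot's requirements against the jobs in the queue.
    condor_q -better-analyze:reverse -machine "$SLOT"
    # Show the slot's state and its START expression from the machine ClassAd.
    # The slot name goes before -af, because -af consumes the arguments after it.
    condor_status "$SLOT" -af Name State Activity Start
else
    echo "HTCondor tools not found; the command would be:"
    reverse_cmd "$SLOT"
fi
```

If the `Start` value printed for an idle slot differs from that of a slot that is running jobs, that difference is a reasonable first place to look.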

Cheers,
Thomas


On 23/08/2023 17.37, ejunior@xxxxxxx wrote:

Hi Thomas,

yes, the nodes appear there, but idle:

---
slot1@xxxxxxxxxxx  LINUX      X86_64 Unclaimed Idle      0.000 3195   7+16:10:09

slot2@xxxxxxxxxxx   LINUX      X86_64 Unclaimed Idle      0.000 3195   7+16:54:31
slot3@xxxxxxxxxxx   LINUX      X86_64 Unclaimed Idle      0.000 3195   7+16:38:18
.
.
.
slot126@xxxxxxxxxxx   LINUX      X86_64 Unclaimed Idle      0.000 3195   7+17:58:06

----

This worker node has 126 slots, all idle.
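With this many slots, a tally per State/Activity pair is easier to read than the full listing. The sketch below assumes the default `condor_status` column layout (Name, OpSys, Arch, State, Activity, LoadAv, Mem, ActvtyTime); the helper name `tally_slots` is made up for this example, and the sample rows are the ones quoted above.

```shell
#!/bin/sh
# Count slots per State/Activity pair in default condor_status output.
# Only lines whose first field looks like a slot name are counted, so
# header and summary lines are ignored.
tally_slots() {
    awk '$1 ~ /^slot[0-9_]*@/ { n[$4 "/" $5]++ }
         END { for (k in n) print k, n[k] }' | sort
}

# Sample rows from the listing above (host name anonymized as in the thread).
tally_slots <<'EOF'
slot1@xxxxxxxxxxx  LINUX      X86_64 Unclaimed Idle      0.000 3195   7+16:10:09
slot2@xxxxxxxxxxx   LINUX      X86_64 Unclaimed Idle      0.000 3195   7+16:54:31
slot3@xxxxxxxxxxx   LINUX      X86_64 Unclaimed Idle      0.000 3195   7+16:38:18
slot126@xxxxxxxxxxx   LINUX      X86_64 Unclaimed Idle      0.000 3195   7+17:58:06
EOF
# prints: Unclaimed/Idle 4
```

On a live pool the same function can read from the tool directly, e.g. `condor_status | tally_slots`.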

Regards,

Eraldo

* /Date/: Fri, 18 Aug 2023 16:48:37 +0200
* /From/: Thomas Hartmann <thomas.hartmann@xxxxxxx>
* /Subject/: Re: [HTCondor-users] Idle New Machines After Adding to
   HTCondor Server: Possible Causes?

------------------------------------------------------------------------

Hi Eraldo,

quick question: your new nodes do show up with their slots in
  condor_status
right?
Do you have static slots or partitionable slots configured?

Cheers,
  Thomas

