
Re: [HTCondor-users] audit for idle/held jobs (system management)



Thanks Todd, I can work from that :) It'll be a lot cleaner than what I was writing.

nomad

On Fri, Jan 5, 2024 at 7:57 AM Todd L Miller via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
> Instead I'd like to get a simple report if a job is idle or held for more
> than 7 days so I can follow up with the user. Before I go crazy writing
> scripts to pull apart the output of condor_q -held and condor_q -idle then
> email if anything is found I thought I'd ask here if someone has already
> solved this problem?

    I'm sure others on the list have solved similar problems, but I
personally don't have any solutions to share. I would observe that if you
don't care if the job is held or idle for your report, you could get both
with a single query of the schedd:

condor_q -const '(JobStatus == 5 || JobStatus == 1) && (time() - EnteredCurrentStatus) > (7 * 24 * 60 * 60)' -af ClusterID ProcID user

(Looking at the following URL for the JobStatus values:
https://htcondor.readthedocs.io/en/latest/classad-attributes/job-classad-attributes.html#JobStatus
)
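
For anyone wanting to turn the query above into a small report, here is a minimal sketch in Python. It assumes condor_q is on the PATH of the submit host and that each output line of -af has exactly three whitespace-separated fields. The function names (build_constraint, parse_autoformat, stale_jobs) are made up for illustration and are not part of HTCondor. Sending the email is left out.

```python
import subprocess
from collections import defaultdict

SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def build_constraint(days=7):
    """Return the ClassAd constraint from the thread: jobs that have been
    idle (JobStatus 1) or held (JobStatus 5) for more than `days` days."""
    return (
        "(JobStatus == 5 || JobStatus == 1) && "
        f"(time() - EnteredCurrentStatus) > ({days} * {SECONDS_PER_DAY})"
    )


def parse_autoformat(text):
    """Group 'ClusterID ProcID user' lines into {user: ['cluster.proc', ...]}."""
    jobs = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        cluster, proc, user = fields
        jobs[user].append(f"{cluster}.{proc}")
    return dict(jobs)


def stale_jobs(days=7):
    """Run the condor_q query from the thread and return jobs grouped by user.
    Hypothetical helper: it needs a reachable schedd to do anything useful."""
    result = subprocess.run(
        ["condor_q", "-constraint", build_constraint(days),
         "-af", "ClusterID", "ProcID", "user"],
        capture_output=True, text=True, check=True,
    )
    return parse_autoformat(result.stdout)
```

Calling stale_jobs(7) on a submit host should return a dictionary keyed by user, which can then be formatted into one message per user. Grouping by user is only one possible layout; a flat list of job IDs would work equally well if a single report is preferred.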

> Is there perhaps even something built into HTCondor that I could
> leverage?

    Not that I'm aware of.