
Re: [HTCondor-users] htcondor python api



thanks

On Thu, Apr 22, 2021 at 10:48 AM Jason Patton via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
htcondor.JobEventLog(path_to_logfile) is indeed what you want to use here:
https://htcondor.readthedocs.io/en/latest/apis/python-bindings/api/htcondor.html#htcondor.JobEventLog

In case it's not clear from the condor_watch_q code, what you want to do is put an inner "for event in jel.events(stop_after=1)" loop inside an outer loop, then break from the outer loop after accumulating the number of htcondor.JobEventType.JOB_TERMINATED events that you expect to see (or after some timeout period).

The condor_watch_q code can be a little heavy, so I whipped up a simple function that I think accomplishes what you want to do:

import htcondor
import time

def wait_for_job(logfile, num_jobs, timeout=None):
  start = time.time()
  completed = 0
  jel = htcondor.JobEventLog(logfile)
  while True:
    # stop_after=0 returns as soon as no more events are available
    for event in jel.events(stop_after=0):
      completed += int(event.type == htcondor.JobEventType.JOB_TERMINATED)
      if event.type in {  # catch some non-termination events that halt job progress
          htcondor.JobEventType.JOB_ABORTED,
          htcondor.JobEventType.JOB_HELD,
          htcondor.JobEventType.CLUSTER_REMOVE,
        }:
        raise RuntimeError("A job was aborted, held, or removed")
    if completed >= num_jobs:  # all expected jobs completed
      break
    if timeout is not None and (time.time() - start) > timeout:
      raise RuntimeError("Timed out waiting for job to complete")
    time.sleep(1)  # wait one second before polling again


This is certainly not perfect by any means (there are other events you might want to raise an exception on, or maybe you don't want to raise an exception at all), but hopefully it gets the idea across.
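
For the "maybe you don't want to raise an exception at all" case, here is a rough sketch of one possible variation that returns a status string instead. The names wait_for_jobs, htcondor_poll, poll, and interval are made up for illustration and are not part of the bindings; the only htcondor calls assumed are JobEventLog, events(stop_after=0), and the JobEventType members already used above. The event source is passed in as a callable so the loop itself can be tried out without a running pool.

```python
import time


def wait_for_jobs(poll, num_jobs, timeout=None, interval=1.0):
    """Return "completed", "halted", or "timeout" instead of raising.

    `poll` is a hypothetical callable that returns the names of the
    event types seen since the previous call, e.g. ["JOB_TERMINATED"].
    """
    halting = {"JOB_ABORTED", "JOB_HELD", "CLUSTER_REMOVE"}
    start = time.time()
    completed = 0
    while True:
        for name in poll():
            if name == "JOB_TERMINATED":
                completed += 1
            elif name in halting:
                return "halted"
        if completed >= num_jobs:
            return "completed"
        if timeout is not None and (time.time() - start) > timeout:
            return "timeout"
        time.sleep(interval)  # wait before polling again


def htcondor_poll(logfile):
    """Build a `poll` callable backed by htcondor.JobEventLog (hypothetical helper)."""
    import htcondor  # imported lazily so the loop above runs without HTCondor

    names = {
        htcondor.JobEventType.JOB_TERMINATED: "JOB_TERMINATED",
        htcondor.JobEventType.JOB_ABORTED: "JOB_ABORTED",
        htcondor.JobEventType.JOB_HELD: "JOB_HELD",
        htcondor.JobEventType.CLUSTER_REMOVE: "CLUSTER_REMOVE",
    }
    jel = htcondor.JobEventLog(logfile)
    # stop_after=0 returns immediately once no more events are available
    return lambda: [names.get(e.type, "OTHER") for e in jel.events(stop_after=0)]
```

With a real log you would call something like wait_for_jobs(htcondor_poll("my_job.log"), num_jobs=10, timeout=3600) and branch on the returned string; "my_job.log" is just a placeholder path.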

Jason Patton

On Thu, Apr 22, 2021 at 8:53 AM <htcondor-users@xxxxxxxxxxx> wrote:
Hi,

Unfortunately there isn't a native way in the Python bindings to wait on a job to complete. You should check out the JobStateTracker class from condor_watch_q for examples of how to parse the event logs: https://github.com/htcondor/htcondor/blob/master/src/condor_scripts/condor_watch_q#L831

- Brian

On 4/22/21 7:58 AM, rmorgan466@xxxxxxxxx wrote:
Using the Python API, I can submit a job, but is there a way to wait until the job is completed?

There is htcondor.JobEventLog("logfile"), but I am not sure how to use it to wait until a task is completed.

--
--- Get your facts first, then you can distort them as you please.--

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


