
[HTCondor-users] It looks like ParallelShutdownPolicy is not working.



I am using the HTCondor Python API to submit a simple parallel-universe task:

import htcondor

schedd = htcondor.Schedd()  # assumed here: the local schedd

with schedd.transaction() as schedd_transaction:
    # Settings shared by every job in the cluster.
    sub = htcondor.Submit(
      {
        "universe": "parallel",
        "executable": "/bin/ping",
        "machine_count": "1",
        "request_cpus": "0",
        "error": ".test.err",
        "output": ".test.out",
        "log": ".test.log",
        "should_transfer_files": "NO",
        "transfer_executable": "False",
        "run_as_owner": "True",
        "+Owner": '"user"',
        "+ParallelShutdownPolicy": "WAIT_FOR_ALL",
      }
    )
    # One job per item: x.0 sends 3 pings, x.1 sends 10.
    res = sub.queue_with_itemdata(
      schedd_transaction,
      1,
      iter(
        [
          {
            "arguments": "-c3 127.0.0.1",
            "initial_dir": "/tmp/tmp1",
          },
          {
            "arguments": "-c10 127.0.0.1",
            "initial_dir": "/tmp/tmp2",
          },
        ]
      ),
    )
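
For anyone who prefers to read submit files, the settings above can be written out as submit-description text. This is only a minimal sketch: it formats strings and never contacts the schedd. The helper name `render_submit` is hypothetical, only a subset of the attributes is shown, and the one-`queue`-statement-per-job layout is my assumption about the closest submit-file equivalent of the item data.

```python
# Hypothetical helper: renders shared settings plus per-job items as
# submit-description text. It only formats strings; HTCondor is not called.
def render_submit(common, items):
    lines = [f"{key} = {value}" for key, value in common.items()]
    for item in items:
        # Each item overrides the per-job settings, then queues one job.
        lines.extend(f"{key} = {value}" for key, value in item.items())
        lines.append("queue")
    return "\n".join(lines)


# Subset of the shared settings from the snippet above.
common = {
    "universe": "parallel",
    "executable": "/bin/ping",
    "machine_count": "1",
    "+ParallelShutdownPolicy": "WAIT_FOR_ALL",
}
# The two per-job items from the snippet above.
items = [
    {"arguments": "-c3 127.0.0.1", "initial_dir": "/tmp/tmp1"},
    {"arguments": "-c10 127.0.0.1", "initial_dir": "/tmp/tmp2"},
]

print(render_submit(common, items))
```

The rendered text shows the shared settings once, followed by two `arguments` / `initial_dir` / `queue` groups, one per job.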


However, the second job (ID x.1) ends prematurely, before its ping run completes. Can anyone please tell me why this is happening?

I have posted further details and diagnostic screenshots on Stack Overflow:
https://stackoverflow.com/questions/57259887/htcondor-how-to-wait-until-all-jobs-are-completed-in-the-parallel-universe-par

--
Sincerely yours,
Ivan Ergunov                         mailto:hozblok@xxxxxxxxx