
Re: [HTCondor-users] Missing jobs in history of parallel universe runs.



Hello Greg,

My initial intent was to monitor single tasks inside a parallel job, but now I guess that is impossible.
Even if a single task's ClassAd can be retrieved while the parallel job is running, all the attributes needed for
monitoring are meaningless. JobStatus, CompletionDate, JobStartDate and ExitCode have the same values for
all tasks in the parallel job and change only when all tasks are completed (with the WAIT_FOR_ALL option).
And there is no way to tell that a task inside the parallel job failed. We can only monitor the ExitCode of the first task.
Please correct me if I am missing something.
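
A minimal sketch of the observation above, in Python. It assumes the per-task ads have already been fetched and converted to plain dictionaries; the function name and the sample values are made up for illustration and are not HTCondor output.

```python
# Attributes named in the message above as candidates for per-task monitoring.
MONITORED = ("JobStatus", "CompletionDate", "JobStartDate", "ExitCode")


def distinguishable_attrs(ads):
    """Return the monitored attributes whose values differ between tasks.

    `ads` is a list of dicts, one per task of the same parallel job.
    An empty result means no monitored attribute tells the tasks apart.
    """
    differing = []
    for attr in MONITORED:
        # Collect the distinct values of this attribute across all tasks.
        values = {ad.get(attr) for ad in ads}
        if len(values) > 1:
            differing.append(attr)
    return differing


# Hypothetical ads for a three-task parallel job that is still running.
# The values are illustrative only.
sample_ads = [
    {"ProcId": 0, "JobStatus": 2, "JobStartDate": 1565272000},
    {"ProcId": 1, "JobStatus": 2, "JobStartDate": 1565272000},
    {"ProcId": 2, "JobStatus": 2, "JobStartDate": 1565272000},
]

print(distinguishable_attrs(sample_ads))
```

If the result is empty for every snapshot taken while the job runs, these attributes cannot be used to detect that one particular task has failed.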

----------
Sergey Komissarov
Senior Software Developer
DATADVANCE


----- Original Message -----
From: "Greg Thain" <gthain@xxxxxxxxxxx>
To: htcondor-users@xxxxxxxxxxx
Sent: Thursday, August 8, 2019 5:47:31 PM
Subject: Re: [HTCondor-users] Missing jobs in history of parallel universe runs.

On 8/2/19 8:12 AM, Sergey A. Komissarov wrote:
> Hello,
>
> During tests of the parallel universe I noticed that after a parallel job finishes, the history contains only jobs with ProcId == 0.
>
>
This is mostly on purpose, but could be considered a design flaw -- A 
single parallel job with multiple procs is one single HTCondor job, and 
condor_rm'ing one of the procs will result in the removal of all related 
procs in the job, unlike other universes.


-greg
