
Re: [HTCondor-users] Missing jobs in history of parallel universe runs.



Hi Sergey,

The person with the most parallel universe experience, Greg, is out this week.  I've CC'd him on this email; it might be worthwhile to re-ping this thread next Monday.

Apologies,

Brian

> On Aug 2, 2019, at 8:12 AM, Sergey A. Komissarov <sergey.komissarov@xxxxxxxxxxxxxx> wrote:
> 
> Hello,
> 
> While testing the parallel universe, I noticed that after a parallel job finishes, the history contains only jobs with ProcId == 0.
> 
> Minimal example from the test code:
> 
> import classad
> import htcondor
> 
> schedd = htcondor.Schedd()
> 
> # Parallel-universe submit description; each item in itemdata yields one proc.
> sub = htcondor.Submit({
>    "executable": "/bin/ping",
>    "output": "test-$(ClusterId)-$(ProcId).out",
>    "universe": "parallel",
>    "machine_count": "1",
>    "+ParallelShutdownPolicy": classad.quote("WAIT_FOR_ALL"),
> })
> itemdata = [{"arguments": "-c3 127.0.0.1"}, {"arguments": "-c3 127.0.0.1"}]
> with schedd.transaction() as txn:
>    sub.queue_with_itemdata(txn, 1, iter(itemdata))
> 
> When the tasks have finished (the files "test-1-0.out" and "test-1-1.out" have been created), condor_history outputs the following (schedd.history() returns the same):
> 
> root@73525f665f55:/# condor_history 
> ID     OWNER          SUBMITTED   RUN_TIME     ST COMPLETED   CMD            
>   1.0   user10101       8/2  12:51   0+00:00:05 C   8/2  12:51 /bin/ping -c3 127.0.0.1
> 
> But when I change universe to "vanilla", history displays all tasks:
> 
> root@e4a38b0ded32:/# condor_history 
> ID     OWNER          SUBMITTED   RUN_TIME     ST COMPLETED   CMD            
>   1.1   user10101       8/2  12:35   0+00:00:04 C   8/2  12:36 /bin/ping -c4 127.0.0.1
>   1.0   user10101       8/2  12:35   0+00:00:02 C   8/2  12:36 /bin/ping -c3 127.0.0.1
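
The comparison between the two runs above could also be checked programmatically rather than by reading the condor_history table. What follows is only a rough sketch, not something from the original report: the helper names `procs_by_cluster` and `missing_procs` are hypothetical, and the job ads are assumed to be dict-like objects exposing `ClusterId` and `ProcId`. The stand-in data mirrors the parallel-universe output shown earlier (only 1.0 recorded).

```python
from collections import defaultdict


def procs_by_cluster(ads):
    """Group the ProcId values found in `ads` by their ClusterId."""
    seen = defaultdict(set)
    for ad in ads:
        seen[int(ad["ClusterId"])].add(int(ad["ProcId"]))
    return dict(seen)


def missing_procs(ads, cluster_id, expected_count):
    """Return the ProcIds in 0..expected_count-1 that have no history ad."""
    present = procs_by_cluster(ads).get(cluster_id, set())
    return sorted(set(range(expected_count)) - present)


# Stand-in for the parallel-universe history above: two procs were
# queued in cluster 1, but only 1.0 appears in the history.
parallel_ads = [{"ClusterId": 1, "ProcId": 0}]
print(missing_procs(parallel_ads, 1, 2))  # prints [1]

# Against a live pool the ads would presumably come from the Python
# bindings instead (assumption: the positional form below is accepted
# by the installed version):
#   import htcondor
#   ads = htcondor.Schedd().history("ClusterId == 1", ["ClusterId", "ProcId"], -1)
```

For the vanilla run, where both 1.0 and 1.1 are present, the same call would return an empty list.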
> 
> Could it be that there is something wrong with parallel jobs and multiple arguments?
> 
> We are using 8.9.2 at the moment:
> 
> root@73525f665f55:/# condor_version
> $CondorVersion: 8.9.2 Jun 04 2019 BuildID: Debian-8.9.2-1 PackageID: 8.9.2-1 Debian-8.9.2-1 $
> $CondorPlatform: X86_64-Ubuntu_18.04 $
> 
> ----------
> Sergey Komissarov
> Senior Software Developer
> DATADVANCE
> 
> 
> 
> 