
Re: [HTCondor-users] condor_submit does not respect multiple universe commands in the same submit file



There is currently no canonical way to get a new cluster within a single submit file. Changing the executable currently
works, and we know that people are using that as a method. As part of getting rid of that behavior, we will
need to add a new method that is more explicit.

If you use the Python bindings, then using a second Submit object within a transaction will result in a new cluster id.
We have no plan to change that.
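
As a rough illustration of the Python-bindings route, here is a minimal sketch. It assumes the htcondor bindings of the 8.8/8.9 era, where Schedd.transaction() and Submit.queue() are available (newer releases changed the submission API), and a reachable schedd. The helper names build_descriptions and submit_all are made up for this example, and the job details are copied from Leo's foo.sub.

```python
def build_descriptions():
    """Return one submit description per universe (values taken from foo.sub)."""
    common = {
        "accounting_group": "ligo.dev.o3.cbc.pe.bayestar",
        "executable": "/usr/bin/env",
        "arguments": "hostname --fqdn",
    }
    descriptions = []
    for universe in ("local", "vanilla"):
        desc = dict(common)
        desc["universe"] = universe
        desc["output"] = "foo-%s.out" % universe
        desc["error"] = "foo-%s.err" % universe
        desc["log"] = "foo-%s.log" % universe
        descriptions.append(desc)
    return descriptions


def submit_all(descriptions):
    """Queue each description through its own Submit object in one transaction.

    Assumption: requires the HTCondor Python bindings and a running schedd.
    Each Submit object gets its own cluster id, so each universe setting
    applies to its own cluster.
    """
    import htcondor  # imported here so the helper above works without it

    schedd = htcondor.Schedd()
    cluster_ids = []
    with schedd.transaction() as txn:
        for desc in descriptions:
            # queue() returns the cluster id allocated for this Submit object
            cluster_ids.append(htcondor.Submit(desc).queue(txn))
    return cluster_ids
```

Calling submit_all(build_descriptions()) on a submit host should return two different cluster ids, one for the local universe job and one for the vanilla universe job.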

We are discussing how to go about allowing multiple submit files to be passed as arguments to condor_submit;
that will likely become the supported way to get more than one cluster. I could also see adding an explicit
command to the submit file that allocates a new cluster, although this would probably also clear out all
of the submit keywords, so it wouldn't really do exactly what you want here.

I have added a comment to #7331, and also some preliminary thoughts in #7336.

-tj

-----Original Message-----
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Stuart Anderson
Sent: Monday, October 21, 2019 5:41 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Singer, Leo P. (GSFC-6610) <leo.p.singer@xxxxxxxx>
Subject: Re: [HTCondor-users] condor_submit does not respect multiple universe commands in the same submit file

tj,
	Is there a canonical way to trigger a new cluster id? And if so, is that considered a reasonable way to switch universes in the same submit file, or should that be completely avoided?

P.S. If you have time please also update ticket 7331.

Thanks.

> On Oct 21, 2019, at 2:49 PM, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:
> 
> This is expected behavior, although we should do a better job of reporting this as an error.
> 
> The universe statement changes the way the submit file parser interprets the submit file itself; it is not simply another attribute of the job.
> Because of this, the parser looks at the universe statement only once each time it allocates a cluster id from the schedd. For nearly
> all submit files, this means once for the whole submission.
> 
> The exception, *for now*, is that when you change the executable, the submit parser allocates a new cluster id from the schedd,
> and that has the side effect of making the latest universe definition take effect for the new cluster. This is arguably a bug.
> 
> But in any case, you should not depend on this behavior!!  It will very likely stop working in the future.  
> 
> condor_submit allocates a second cluster id when the executable changes because that was required for the (now obsolete) standard universe. In the 8.9 series, standard universe is gone, and we are in the process of tearing out all of the special-case code that handled it. At some point this will include the code that allocates a new cluster id when the executable changes.
> 
> -tj
> 
> -----Original Message-----
> From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Singer, Leo P. (GSFC-6610) via HTCondor-users
> Sent: Friday, October 18, 2019 1:38 PM
> To: htcondor-users@xxxxxxxxxxx
> Cc: Singer, Leo P. (GSFC-6610) <leo.p.singer@xxxxxxxx>
> Subject: [HTCondor-users] condor_submit does not respect multiple universe commands in the same submit file
> 
> Hi,
> 
> I am trying to put multiple jobs in different universes in the same condor_submit file. I have one local universe job and one vanilla universe job. It looks like condor_submit is ignoring the second universe command and running both jobs in the local universe. See example below. This is with HTCondor 8.8.5 on the LIGO-Caltech computing cluster.
> 
> Thanks,
> Leo
> 
> Dr. Leo P. Singer
> Research Astrophysicist
> Astroparticle Physics Laboratory
> NASA Goddard Space Flight Center
> 
> 
> 
> ldas-pcdev1:~ lsinger$ cat foo.sub 
> accounting_group = ligo.dev.o3.cbc.pe.bayestar
> executable = /usr/bin/env
> arguments = "hostname --fqdn"
> 
> universe = local
> output = foo-local.out
> error = foo-local.err
> log = foo-local.log
> queue
> 
> universe = vanilla
> output = foo-vanilla.out
> error = foo-vanilla.err
> log = foo-vanilla.log
> queue
> 
> 
> ldas-pcdev1:~ lsinger$ cat foo-local.out 
> ldas-pcdev1.ligo.caltech.edu
> 
> 
> ldas-pcdev1:~ lsinger$ cat foo-vanilla.out 
> ldas-pcdev1.ligo.caltech.edu
> 
> 
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> 
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/
> 

--
Stuart Anderson
sba@xxxxxxxxxxx



