
Re: [Condor-users] Simulating while loops in DAGMan



Thanks for your comments; I wasn't aware of the Condor Perl API.
An added complication I didn't mention is that this loop may well be
called as an inner DAG of yet another DAG.  I think I can get round
this by writing a Perl script that implements the while loop using
the Condor Perl API and then calling that script as a job itself:

DAG:
...
JOB while_loop  condor/while_loop.submit
...

while_loop.submit:
...
Executable = condor/scripts/iterate.pl
Arguments = condor/while_body.submit condor/scripts/loop_condition.pl
...
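
A minimal sketch of how such an iterate script might drive the loop. It is
written in Python purely for illustration (the script named above is Perl),
and it shells out to condor_submit and condor_wait rather than using the
Condor Perl API. The names run_loop, make_condor_body, make_condition and the
log_file argument are hypothetical; it assumes the body's submit file writes
its user log to log_file, and that the condition script exits with status 2
when the loop should stop, as in my original DAG.

```python
import subprocess


def run_loop(run_body, converged, max_iterations=1000):
    """Repeat run_body() until converged() returns True or the cap is hit.

    Returns (iterations_run, converged_flag).
    """
    for iteration in range(1, max_iterations + 1):
        run_body()
        if converged():
            return iteration, True
    return max_iterations, False


def make_condor_body(submit_file, log_file):
    """Build a loop body that submits one job and blocks until it finishes.

    Assumption: submit_file sets its user log to log_file.
    """
    def body():
        subprocess.run(["condor_submit", submit_file], check=True)
        subprocess.run(["condor_wait", log_file], check=True)
    return body


def make_condition(condition_script):
    """Build a check that treats exit status 2 of the script as 'stop'."""
    def converged():
        return subprocess.run([condition_script]).returncode == 2
    return converged
```

The driver would then call run_loop(make_condor_body(...), make_condition(...))
with the two paths passed in Arguments. The cap of 1000 mirrors the RETRY
count I used before.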

Thanks for your help,

Partha


On 01/11/05, Mark Silberstein <silbmarks@xxxxxxxxx> wrote:
> Let me add one more argument: if you want to remove jobs that were submitted
> via a DAG, all it takes is condor_rm <DAG cluster ID>. Your Perl script,
> however, will leave its running jobs in the queue if it is killed. So it
> should be clever enough to install a SIGTERM handler, and every job should
> carry an identifier in its ClassAd so that the jobs can be removed easily.
> That is what DAGMan does, actually. So you are starting to rewrite quite a
> bit of DAGMan's functionality.
>  That said, I completely agree that if you have a complex flow operated by
> a clever, Condor-experienced user, a Perl script will certainly do the job.
> But if you have an automated, unattended system run by non-Condor users, I
> think that using a DAG really pays off.
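
The cleanup described in the message above (a SIGTERM handler plus an
identifying ClassAd attribute) could be sketched roughly as follows, again in
Python for illustration only. The attribute name WhileLoopTag and the function
names are hypothetical; the sketch assumes every submit file used by the loop
contains a line such as +WhileLoopTag = "loop-42", so that condor_rm
-constraint can match the jobs.

```python
import signal
import subprocess
import sys


def removal_command(tag, attribute="WhileLoopTag"):
    """Build the condor_rm call that removes every job carrying the tag."""
    return ["condor_rm", "-constraint", f'{attribute} == "{tag}"']


def make_cleanup_handler(tag, run=subprocess.run):
    """Return a signal handler that removes the tagged jobs, then exits."""
    def handler(signum, frame):
        run(removal_command(tag))
        sys.exit(1)
    return handler


def install_cleanup(tag):
    """Register the handler for SIGTERM (call this from the main thread)."""
    signal.signal(signal.SIGTERM, make_cleanup_handler(tag))
```

Calling install_cleanup("loop-42") once at start-up would be enough. A SIGKILL
cannot be caught, so this does not cover every way the script can die.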
>
>
>
> On 11/1/05, Thomas Materna <materna@xxxxxxxxxxxxx> wrote:
> >
> > I see what you mean, but I think it's not that hard to have your Perl
> > script save its status and restart where it left off, which makes it
> > fault tolerant.
> > It's even easier if, as I suspect, the idea is to keep the program running
> > until the output meets a certain criterion. When the Perl script restarts,
> > it will see that the criterion is not yet met and continue the loop until
> > it is.
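
One way to read "save its status and restart where it left off" is a small
state file that is rewritten after every iteration. The sketch below is
Python for illustration only; the state layout, the file handling and the
names load_state, save_state and resume_loop are assumptions, not anything
Condor provides.

```python
import json
import os


def load_state(path):
    """Return the saved loop state, or a fresh one if no file exists yet."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"iteration": 0, "converged": False}


def save_state(path, state):
    """Write to a temporary file, then rename, so a crash cannot leave a
    half-written state file behind."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def resume_loop(path, run_body, converged, max_iterations=1000):
    """Continue the loop from the saved iteration until it converges."""
    state = load_state(path)
    while not state["converged"] and state["iteration"] < max_iterations:
        run_body(state["iteration"])
        state["iteration"] += 1
        state["converged"] = converged()
        save_state(path, state)
    return state
```

If the script dies in the middle of an iteration, that iteration is simply run
again on restart, so the loop body has to tolerate being repeated.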
> >
> > Thomas
> >
> > Cyclotron Institute, Texas A&M university
> > ZIP 77843-3366
> > (979)-845-1411 ext. 258
> >
> >
> >
> > ________________________________
> > From: condor-users-bounces@xxxxxxxxxxx
> > [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Mark Silberstein
> > Sent: Tuesday, November 01, 2005 7:25
> > To: Condor-Users Mail List
> > Subject: Re: [Condor-users] Simulating while loops in DAGMan
> >
> >
> >
> > This is all true, as long as you don't care about fault tolerance.
> > If, for instance, your submission machine is restarted, your submission
> > will not be resumed once the machine is back up. You can get that by
> > submitting your Perl script in the Scheduler universe, which restarts the
> > script once Condor is running again. However, when the script is started
> > again, a straightforward implementation would restart the loop from the
> > beginning, which means that the script cannot be that simplistic.
> >
> >  With a DAG all this is handled automatically, which can be preferable to
> > the overhead of getting around the unfortunate limitations of this
> > extremely useful tool.
> >
> >
> >
> > On 11/1/05, Thomas Materna <materna@xxxxxxxxxxxxx> wrote:
> > > Hi,
> > >
> > > You are using a Perl script, I see. Why don't you use the Condor Perl
> > > module and write a Perl script with the submission enclosed in a while
> > > loop? It looks much more straightforward than trying to get around the
> > > limitations of DAGMan.
> > >
> > > Thomas
> > >
> > > Cyclotron Institute, Texas A&M university
> > > ZIP 77843-3366
> > > (979)-845-1411 ext. 258
> > >
> > > >
> > > > Hi,
> > > >
> > > > I'm trying to execute a job repeatedly until some convergence
> > > > criterion is met.  I realise I can't have any loops, so I'm
> > > > trying to work around that by using the RETRY function, i.e. my DAG is:
> > > >
> > > > JOB iteration condor/while_body.submit
> > > > RETRY iteration 1000 UNLESS-EXIT 2
> > > > SCRIPT POST iteration condor/scripts/loop_condition.pl
> > > >
> > > > - while_body.submit is the body of the loop and I've made it
> > > > always exit with status 1.
> > > > - loop_condition.pl will exit with status 2 if we should
> > > > terminate the loop, 0 otherwise.
> > > > - I'm happy that the loop will stop after 1000 iterations no
> > > > matter what (hopefully it won't get that far!)
> > > >
> > > > If that can work, I'd like to replace the body with a DAG
> > > > i.e. an inner DAG produced by "condor_submit_dag -no_submit".
> > > > FYI, I've got a solution for for loops, I just need while loops!
> > > >
> > > > Thanks in advance,
> > > >
> > > > Partha
> > > >
> > > >
> > > > --
> > > > Partha Lal
> > > > PhD Student
> > > > CSTR, University of Edinburgh
> > > >
> > > > _______________________________________________
> > > > Condor-users mailing list
> > > > Condor-users@xxxxxxxxxxx
> > > > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> > > >


--
Partha Lal
PhD Student
CSTR, University of Edinburgh