
Re: [Condor-users] Simulating while loops in DAGMan



Let me add one more argument: if you want to remove jobs that were submitted via a DAG, all it takes is condor_rm <DAG cluster ID>. Your Perl script, however, will leave running jobs in the queue if it is killed. So it needs to be clever enough to install a SIGTERM handler, and every job should carry an identifier in its ClassAd so that the jobs can be removed easily. That is what DAGMan does, actually. In other words, you are starting to rewrite quite a bit of DAGMan's functionality.
That said, I completely agree that if you have a complex flow operated by a clever, Condor-experienced user, a Perl script will certainly do the job. But if you have an automated, unattended system run by non-Condor users, I think that using a DAG really pays off.
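
The cleanup described above (a SIGTERM handler plus an identifier in each job's ClassAd) could be sketched roughly as follows. This is only an illustration in Python, not code from this thread: the attribute name LoopRunTag, the tag value, and the helper names are hypothetical, and it assumes every submit file adds the attribute with a line such as +LoopRunTag = "loop-run-001".

```python
import signal
import subprocess
import sys

RUN_TAG = "loop-run-001"  # hypothetical identifier shared by all jobs of this run


def build_rm_command(tag):
    """Build a condor_rm call that removes every job carrying our tag."""
    # Assumes each submit file contains: +LoopRunTag = "<tag>"
    return ["condor_rm", "-constraint", 'LoopRunTag == "%s"' % tag]


def make_sigterm_handler(tag, run=subprocess.call, exit_fn=sys.exit):
    """Return a signal handler that removes the tagged jobs, then exits."""
    def handler(signum, frame):
        run(build_rm_command(tag))
        exit_fn(1)
    return handler


def install_cleanup(tag=RUN_TAG):
    """Register the handler so that killing the script also cleans the queue."""
    signal.signal(signal.SIGTERM, make_sigterm_handler(tag))
```

The script would call install_cleanup() once before entering its submission loop. Note that this does not help if the script is killed with SIGKILL or the machine goes down, which is part of the argument for letting DAGMan do the bookkeeping.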


On 11/1/05, Thomas Materna <materna@xxxxxxxxxxxxx> wrote:
I see what you mean, but I think it's not that hard to have your Perl script save its status and restart where it left off, which makes it fault tolerant.
It's even easier if, as I suspect, the idea is to keep the program running until the output meets a certain criterion. When the Perl script restarts, it will see that the criterion is not met yet and continue the loop until it is.
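
As a rough illustration of saving the status and picking up where the script left off, here is a minimal sketch in Python. The state file layout and the callbacks run_iteration and is_converged are assumptions made for the example; nothing here comes from the original script.

```python
import json
import os


def load_state(path):
    """Return the saved loop state, or a fresh one if no file exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"iteration": 0, "converged": False}


def save_state(path, state):
    """Write to a temporary file, then rename it over the old state file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run_loop(path, run_iteration, is_converged, max_iterations=1000):
    """Resume from the saved state and iterate until the criterion is met."""
    state = load_state(path)
    while not state["converged"] and state["iteration"] < max_iterations:
        result = run_iteration(state["iteration"])
        state["iteration"] += 1
        state["converged"] = is_converged(result)
        save_state(path, state)  # checkpoint after every completed iteration
    return state
```

An iteration that is interrupted before its checkpoint is written gets rerun after a restart, so the loop body should be safe to repeat.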
 
Thomas

Cyclotron Institute, Texas A&M University
ZIP 77843-3366
(979)-845-1411 ext. 258

 


From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Mark Silberstein
Sent: Tuesday, November 01, 2005 7:25
To: Condor-Users Mail List
Subject: Re: [Condor-users] Simulating while loops in DAGMan

This is all true, as long as you don't care about fault tolerance.
If, for instance, your submission machine is restarted, your submission will not resume once the machine is back up. You can get that behaviour by submitting your Perl script in the Scheduler universe, which restarts the script once Condor is running again. However, when the script starts again, a straightforward implementation would restart the loop from the beginning, so the script cannot be that simplistic.

With a DAG all of this is handled automatically, which can be worth the overhead of working around the unfortunate limitations of this extremely useful tool.


On 11/1/05, Thomas Materna <materna@xxxxxxxxxxxxx> wrote:
Hi,

I see you are using a Perl script. Why don't you use the Condor Perl module
and write a Perl script with the submission enclosed in a while loop? It
looks much more straightforward than trying to get around the limitations of
DAGMan.
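
The shape of such a loop might look like the sketch below. It is written in Python and shells out to condor_submit and condor_wait rather than using the Condor Perl module, so treat it as an outline of the idea only. The function names and the converged callback are hypothetical, and it assumes the submit file writes its user log to log_file.

```python
import subprocess


def submit_and_wait(submit_file, log_file, run=subprocess.check_call):
    """Submit one iteration and block until its jobs have completed."""
    run(["condor_submit", submit_file])
    # condor_wait watches the user log and returns once the jobs in it are done.
    run(["condor_wait", log_file])


def loop_until(converged, submit_file, log_file, max_iterations=1000,
               run=subprocess.check_call):
    """Repeat the loop body until converged() is true or the cap is reached."""
    iterations = 0
    while iterations < max_iterations:
        submit_and_wait(submit_file, log_file, run=run)
        iterations += 1
        if converged():
            break
    return iterations
```

The max_iterations cap plays the same role as the retry limit of 1000 in the DAG from the original question.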

Thomas

Cyclotron Institute, Texas A&M University
ZIP 77843-3366
(979)-845-1411 ext. 258

>
> Hi,
>
> I'm trying to execute a job repeatedly until some convergence
> criterion is met.  I realise I can't have any loops so I'm
> trying to work around by using the RETRY function i.e. my DAG is:
>
> JOB iteration condor/while_body.submit
> RETRY iteration 1000 UNLESS-EXIT 2
> SCRIPT POST iteration condor/scripts/loop_condition.pl
>
> - while_body.submit is the body of the loop and I've made it
> always exit with status 1.
> - loop_condition.pl will exit with status 2 if we should
> terminate the loop, 0 otherwise.
> - I'm happy that the loop will stop after 1000 iterations no
> matter what (hopefully it won't get that far!)
>
> If that can work, I'd like to replace the body with a DAG
> i.e. an inner DAG produced by "condor_submit_dag -no_submit".
> FYI, I've got a solution for for loops, I just need while loops!
>
> Thanks in advance,
>
> Partha
>
>
> --
> Partha Lal
> PhD Student
> CSTR, University of Edinburgh
>
> _______________________________________________
> Condor-users mailing list
> Condor-users@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>
