
Re: [Condor-users] Notification of multiple job completion



Hi Kent,

Thanks for the info - this sounds exactly like what I need.

Rob

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of R. Kent Wenger
Sent: Friday, October 29, 2010 4:56 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Notification of multiple job completion

On Fri, 29 Oct 2010, Rob Matthews wrote:

> I am new to Condor and am using it for Monte Carlo simulation. Each MC run
> is independent and carried out by a given executable which produces a
> results file, so I have a wrapper program which populates the inputs for
> these and submits all the needed runs to the Condor queue. This all works
> great except now I need some way of knowing when all the MC runs I
> submitted are complete so I can postprocess results (i.e. parse all the
> individual results files and operate as needed).
>
> Right now my wrapper code does this by polling the local directory every 5
> seconds looking for the needed results files but this becomes inefficient
> with large simulations. Is there a mechanism in Condor to possibly execute a
> program (like my postprocessing code) once all the jobs submitted to the
> queue are complete?

You can do this by putting all of your MC jobs into a DAG with no 
dependencies (see 
http://www.cs.wisc.edu/condor/manual/v7.5/2_10DAGMan_Applications.html#SECTION003106500000000000000 
for info about DAGMan).
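
As a rough sketch of the flat-DAG idea, the Python below builds the text of a 
DAG file with one JOB line per MC run and no PARENT/CHILD lines. The node 
names (mc0, mc1, ...), the submit file names (mc_run_N.sub) and the function 
name are hypothetical; it assumes each run already has its own submit 
description file.

```python
def flat_dag(n_runs):
    """Return DAG file text with one independent node per MC run.

    Node and submit-file names are illustrative assumptions,
    not anything prescribed by Condor.
    """
    lines = [f"JOB mc{i} mc_run_{i}.sub" for i in range(n_runs)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a small example; redirect to a file such as mc.dag to use it.
    print(flat_dag(3), end="")
```

Saving the output as, say, mc.dag and running condor_submit_dag mc.dag 
should hand all the runs to DAGMan. Because DAGMan only finishes once every 
node has finished, its own completion can serve as the "all runs done" 
signal in place of polling for results files.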

However, from your description, it sounds like you might benefit from 
using DAGMan for more than just getting the notification when things are 
done.  You could make the code that creates the input files a node in the 
DAG, then have all of the actual MC jobs be dependent on that node, and 
then have another node that does the postprocessing that's dependent on 
all of the MC nodes.  This would get you the correct sequences of job 
submissions without any coding on your part, and it would also enable you 
to get rid of your wrapper code that does the actual submits.  Plus you 
get all the other goodness of DAGMan, like options to re-try failed 
nodes...
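
A minimal sketch of that three-stage layout, again as Python that generates 
the DAG file text. The node names (prepare, mcN, postprocess), the submit 
file names and the default retry count are all assumptions made up for 
illustration; only the JOB, PARENT ... CHILD and RETRY keywords come from 
DAGMan itself.

```python
def pipeline_dag(n_runs, retries=2):
    """Return DAG text for: prepare -> n_runs MC nodes -> postprocess.

    All node and submit-file names here are hypothetical.
    """
    if n_runs < 1:
        raise ValueError("need at least one MC run")
    mc = [f"mc{i}" for i in range(n_runs)]
    lines = ["JOB prepare prepare.sub"]
    lines += [f"JOB {name} {name}.sub" for name in mc]
    lines.append("JOB postprocess postprocess.sub")
    # Every MC node waits for the input-generation node.
    lines.append(f"PARENT prepare CHILD {' '.join(mc)}")
    # The postprocessing node waits for every MC node.
    lines.append(f"PARENT {' '.join(mc)} CHILD postprocess")
    # Re-run a failed MC node up to `retries` times.
    lines += [f"RETRY {name} {retries}" for name in mc]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(pipeline_dag(3), end="")
```

With a file like this, a single condor_submit_dag call would replace both 
the submitting wrapper and the polling loop, since DAGMan does not start 
the postprocessing node until all of its parents have succeeded.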

Kent Wenger
Condor Team