
Re: [Condor-users] Notification of multiple job completion



You might also try condor_wait.
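
As a rough illustration, a wrapper could block on condor_wait instead of polling for results files. This is only a sketch: it assumes every MC job writes to one shared user log (the name mc_runs.log is hypothetical, set via "log =" in the submit file) and that condor_wait is on the PATH. The helper names are made up for the example.

```python
import subprocess


def build_condor_wait_command(log_file, timeout_seconds=None):
    """Return the argument list for running condor_wait on one job log."""
    cmd = ["condor_wait"]
    if timeout_seconds is not None:
        # -wait limits how long condor_wait blocks, in seconds.
        cmd += ["-wait", str(timeout_seconds)]
    cmd.append(log_file)
    return cmd


def wait_for_jobs(log_file, timeout_seconds=None):
    """Block until condor_wait returns; True means it exited with status 0."""
    result = subprocess.run(build_condor_wait_command(log_file, timeout_seconds))
    return result.returncode == 0
```

Calling wait_for_jobs("mc_runs.log") after the submits and starting the postprocessing only when it returns True would replace the 5-second polling loop.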

Best,


matt

On 11/02/2010 01:41 PM, Rob Matthews wrote:
Hi Kent,

Thanks for the info - this sounds exactly like what I need.

Rob

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of R. Kent Wenger
Sent: Friday, October 29, 2010 4:56 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Notification of multiple job completion

On Fri, 29 Oct 2010, Rob Matthews wrote:

I am new to Condor and am using it for Monte Carlo simulation. Each MC run
is independent and carried out by a given executable which produces a
results file, so I have a wrapper program which populates the inputs for
these and submits all the needed runs to the Condor queue. This all works
great except now I need some way of knowing when all the MC runs I submitted
are complete so I can postprocess results (i.e. parse all the individual
results files and operate as needed).

Right now my wrapper code does this by polling the local directory every 5
seconds looking for the needed results files, but this becomes inefficient
with large simulations. Is there a mechanism in Condor to execute a program
(like my postprocessing code) once all the jobs submitted to the queue are
complete?

You can do this by putting all of your MC jobs into a DAG with no
dependencies (see
http://www.cs.wisc.edu/condor/manual/v7.5/2_10DAGMan_Applications.html#SECTION003106500000000000000
for info about DAGMan).

However, from your description, it sounds like you might benefit from
using DAGMan for more than just getting the notification when things are
done.  You could make the code that creates the input files a node in the
DAG, then have all of the actual MC jobs be dependent on that node, and
then have another node that does the postprocessing that's dependent on
all of the MC nodes.  This would get you the correct sequences of job
submissions without any coding on your part, and it would also enable you
to get rid of your wrapper code that does the actual submits.  Plus you
get all the other goodness of DAGMan, like options to re-try failed
nodes...
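
To make the layout above concrete, here is a minimal sketch that generates such a DAG file from Python. The node names (setup, mc0, mc1, ..., post), the submit file names, and the retry count are all hypothetical; only the JOB, PARENT ... CHILD, and RETRY keywords come from the DAGMan input file format.

```python
def build_dag(num_runs, retries=2):
    """Return DAG file text: setup -> N independent MC runs -> postprocessing."""
    if num_runs < 1:
        raise ValueError("num_runs must be at least 1")
    mc_nodes = [f"mc{i}" for i in range(num_runs)]
    lines = ["JOB setup setup.sub"]  # node that creates the input files
    lines += [f"JOB {name} {name}.sub" for name in mc_nodes]
    lines.append("JOB post post.sub")  # node that parses the results files
    # Every MC run waits for setup; post waits for every MC run.
    lines.append("PARENT setup CHILD " + " ".join(mc_nodes))
    lines.append("PARENT " + " ".join(mc_nodes) + " CHILD post")
    # Re-try each MC node a few times if it fails.
    lines += [f"RETRY {name} {retries}" for name in mc_nodes]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file such as mc.dag and running condor_submit_dag mc.dag would then submit the whole workflow, assuming the referenced submit files exist.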

Kent Wenger
Condor Team
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/