[classad-users] Dependencies and repetitions in DAGMan


Date: Thu, 4 Jun 2009 22:22:55 -0400
From: Marc Tardif <marc@xxxxxxxxxxxxx>
Subject: [classad-users] Dependencies and repetitions in DAGMan
Hi folks,

I have been exploring grid application description languages from various
sources and found DAGMan to be the most promising. However, I have a
few use cases that I would appreciate some help expressing with DAGMan
and ClassAd:

1. In a DAG input file, it seems that the submit description filename
   given to a job serves as its unique name when expressing
   dependencies. That was a mouthful, so here's an example:

   # Filename: B.dag
   JOB A A.condor DONE
   JOB B B.condor
   PARENT A CHILD B

   So, my understanding is that job B will only run once all jobs
   described by A.condor are completed. For example, let's say the
   following submit files were enqueued in this order:

   1. A.condor
   2. B.dag
   3. A.condor

   Then, would B.dag only run once #1 is completed, or once all
   submissions matching A.condor are completed, or is there something
   I don't understand?

2. Is there a way to write either a submit description file or a DAG
   input file so that an executable runs exactly once on each node in
   a cluster? If not, must I enqueue a separate submit description file
   for each node with something like:

   requirements = other.hostname == "foo"

   And so forth for each host. (Note that "hostname" probably isn't
   part of ClassAd, but I mean anything that uniquely identifies each
   node in a cluster.)
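
To make the per-host approach concrete, here is a minimal Python sketch
that generates one submit description per host. Everything in it is an
assumption made for illustration: the function names, the <host>.condor
file names, the vanilla universe, the /bin/hostname executable, and above
all the "hostname" attribute, which is only a placeholder for whatever
really identifies a node.

```python
def submit_description(host, executable, attribute="hostname"):
    """Return the text of a submit description pinned to a single host.

    `attribute` is a placeholder for whatever machine attribute uniquely
    identifies a node; "hostname" is an assumption, not a known ClassAd name.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        # ClassAd string literals are written with double quotes.
        f'requirements = other.{attribute} == "{host}"',
        "queue",
    ]
    return "\n".join(lines) + "\n"


def per_host_descriptions(hosts, executable):
    """Map a hypothetical file name to submit description text, one per host."""
    return {
        f"{host}.condor": submit_description(host, executable)
        for host in hosts
    }


if __name__ == "__main__":
    descriptions = per_host_descriptions(["foo", "bar"], "/bin/hostname")
    for name, text in descriptions.items():
        print(f"# Filename: {name}")
        print(text)
```

This works, but it means keeping the host list in sync by hand, which is
what I would like to avoid.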

3. Would it be possible to remove a resource provider (a machine) from
   a cluster, but only once its current jobs have completed, along with
   all the dependent jobs defined by the pending DAG input files?
   Here's an example:

   # Filename: A.dag
   JOB A A.condor
   JOB B B.condor
   PARENT A CHILD B

   So, if a node is in the middle of running job A, I would like to be
   notified somehow when job B has completed. However, I don't necessarily
   want to hard-code that I'm waiting for job B to complete. I would rather
   express it abstractly: tell me when the current jobs and their
   dependents have completed.
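
To pin down what I mean by "the current jobs and dependents", here is a
minimal Python sketch that reads the PARENT ... CHILD lines of a DAG input
file and collects every job reachable from the running ones. The function
names are hypothetical, the parser handles only PARENT/CHILD lines, and
treating the keywords case-insensitively is my own choice. It does not
query Condor at all. It only computes the set I would like to be notified
about.

```python
from collections import defaultdict, deque


def parse_dependencies(dag_text):
    """Return a mapping from each parent job name to its direct children."""
    children = defaultdict(set)
    for line in dag_text.splitlines():
        words = line.split()
        if not words or words[0].upper() != "PARENT":
            continue  # skip comments, JOB lines, and anything else
        upper = [word.upper() for word in words]
        if "CHILD" not in upper:
            continue  # malformed dependency line, ignored in this sketch
        split = upper.index("CHILD")
        parents = words[1:split]
        kids = words[split + 1:]
        for parent in parents:
            children[parent].update(kids)
    return children


def jobs_to_wait_for(dag_text, running):
    """Return the running jobs plus all of their transitive dependents."""
    children = parse_dependencies(dag_text)
    seen = set(running)
    pending = deque(running)
    while pending:  # breadth-first walk along PARENT -> CHILD edges
        job = pending.popleft()
        for child in children.get(job, ()):
            if child not in seen:
                seen.add(child)
                pending.append(child)
    return seen


if __name__ == "__main__":
    a_dag = "JOB A A.condor\nJOB B B.condor\nPARENT A CHILD B\n"
    print(sorted(jobs_to_wait_for(a_dag, ["A"])))
```

For A.dag above with job A running, the set is A and B, so the machine
could be removed once both have finished.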

Thanks,
Marc
