
Re: [HTCondor-users] Determine when all jobs in a cluster have finished?



On Wed, 30 Jan 2013, Brian Pipa wrote:

I'd really like the whole thing to be self-contained in one DAG like:
###
Job QueryDB querydb.job
Job Workers workers.job
Job PostProcess postprocess.job
PARENT QueryDB CHILD Workers
PARENT Workers CHILD PostProcess
###

since that seems much simpler and more self-contained, but I don't think
that's doable, because the results of the QueryDB job determine the data
and the number of worker jobs I'll need. For example, one run of QueryDB
could get 2 million results and I would create 2000 data files
containing 1000 entries each and those would be consumed by 2000
worker jobs. Another run might create only 1 data file and 1 worker. I
can't think of a way to get this all working within one DAG file.
Right now, I pass in to each worker an argument of the datafile to
process.

Actually, you *can* do this. The "trick" is that the workers.job file would not exist at the time you submit the overall DAG. The workers.job file would be written by the QueryDB job (or maybe a post script to the QueryDB job). So the workers.job file could be customized to create however many workers you needed based on the results of the query.
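
For illustration, the generated workers.job might look something like this, with one arguments/queue pair per data file (the executable and file names here are invented for the example, not taken from your setup):

###
universe   = vanilla
executable = worker
log        = workers.log

arguments = data_0001.dat
queue

arguments = data_0002.dat
queue
###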

If you want to use a POST script, the syntax is like this:

  SCRIPT POST <job> <script> [arguments]
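
For example, to attach a script (hypothetically named write_workers.pl here) to the QueryDB node in the DAG above:

  SCRIPT POST QueryDB write_workers.pl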

So you could write something like a Perl script that figures out how many workers you want and writes the workers.job file accordingly.
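
As a rough sketch (the data file naming pattern and the worker executable name are assumptions about your setup, not anything from your mail):

###
#!/usr/bin/perl
# Sketch: write a workers.job submit file with one job per data file.
use strict;
use warnings;

# Data files written by the QueryDB job (the pattern is an assumption).
my @datafiles = sort glob("data_*.dat");
die "no data files found\n" unless @datafiles;

open(my $fh, '>', 'workers.job') or die "cannot write workers.job: $!";

# Settings shared by every worker job.
print $fh "universe   = vanilla\n";
print $fh "executable = worker\n";
print $fh "log        = workers.log\n";

# One queue statement per data file, passing the file as the argument.
for my $file (@datafiles) {
    print $fh "\narguments = $file\n";
    print $fh "output    = $file.out\n";
    print $fh "error     = $file.err\n";
    print $fh "queue\n";
}

close $fh or die "close failed: $!";
###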

Kent Wenger
CHTC Team