
Re: [HTCondor-users] Dynamic DAGs



On Wed, Jun 12, 2013 at 12:40:56PM -0700, Dmitry Grudzinskiy wrote:
> Hello,
> 
> My name is Dmitry and our company is currently looking at different
> solutions for our cluster. We are using Torque at the moment and not very
> unhappy with it.
> 
> HTCondor seems very attractive specifically because of DAG functionality
> that we would utilize a lot.
> 
> But the workflows that we have may require dynamic DAGs meaning that the
> graph may change its structure during runtime and there is no way for us
> to know it before we start.
> 
> For example, node A generates a number of files that is unknown before we
> start. After that job is done, we process these files in parallel, and the
> number of jobs should equal the number of files, each job processing
> one file. After these n jobs are done, the DAG may continue and have
> other jobs to run.
> 

Dmitry:

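You will probably want to combine a POST script on node A with a nested
DAG (SUBDAG EXTERNAL) for the child node. The POST script runs after A
finishes and writes the DAG file for the child; DAGMan does not read that
file until the child node is ready to be submitted, so it can be generated
at runtime.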

http://research.cs.wisc.edu/htcondor/manual/v8.0/2_10DAGMan_Applications.html

has sufficient detail for you to accomplish this.

> What would be the best way to solve this?
> 
> Thank you,
> Dmitry

For example:

job A do-something
script post A figure-out-what-we-did
subdag external B figured-it-out.dag
job C next-step
parent A child B
parent B child C

figure-out-what-we-did:
===================================>8===================================
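#!/bin/bash
# POST script for node A: write one DAG node per output file of A.
# process_output is a placeholder for your own command that prints
# the DAG lines for file $i, using $j as a running index.
: > figured-it-out.dag          # start fresh if the script is rerun
j=0
for i in A-*.out; do
    process_output "$i" "$j" >> figured-it-out.dag
    j=$((j + 1))
done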
===================================8<===================================
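
process_output above is only a placeholder, so here is a minimal sketch of
what it could look like. It assumes a submit file named process-one.sub
(hypothetical, not something you already have) that picks up its input
file through a $(infile) macro, and it names the inner nodes B0, B1, ...
purely for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print the DAG lines for one output file of A.
#   $1 = output file produced by node A
#   $2 = running index, used to give each node a unique name
# process-one.sub is an assumed submit file that uses $(infile).
process_output() {
    printf 'JOB B%s process-one.sub\n' "$2"
    printf 'VARS B%s infile="%s"\n' "$2" "$1"
}
```

Define it in the POST script (or source it) before the loop. Since it
writes no PARENT/CHILD lines, the nodes in figured-it-out.dag are
independent of one another and DAGMan can run them in parallel.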

Nathan Panike