
Re: [HTCondor-users] Determine when all jobs in a cluster have finished?

On Wed, Jan 30, 2013 at 9:42 AM, Brian Candler <B.Candler@xxxxxxxxx> wrote:
> On Wed, Jan 30, 2013 at 09:21:46AM -0500, Brian Pipa wrote:
>> I'd really like the whole thing to be self-contained in one DAG like:
>> ###
>> Job QueryDB querydb.job
>> Job Workers workers.job
>> Job PostProcess postprocess.job
>> PARENT QueryDB CHILD Workers
>> PARENT Workers CHILD PostProcess
>> ###
>> since that seems much simpler and self-contained, but I don't think
>> that's doable, since the results of the QueryDB job determine the data
>> and the number of worker jobs I'll need.
> I thought I read something along those lines a while back. In principle I
> think it is doable, because workers.job is not processed until the time that
> job is actually submitted.
> I don't know the cleanest way of making this dynamic, but you could try
> simply overwriting the workers.job file from the querydb job.
> Maybe what I am thinking of is this:
> http://research.cs.wisc.edu/htcondor/manual/v7.8/2_10DAGMan_Applications.html#SECTION003107800000000000000
> It says explicitly that a DAG may modify a sub-DAG before it is submitted.
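A minimal sketch of that trick, assuming the QueryDB job runs a script like the following (the file names results.txt and worker.sh are made up for illustration, and the printf stands in for the real database query):

```shell
#!/bin/sh
# Hypothetical sketch: the QueryDB job's script regenerates workers.job
# before DAGMan submits the Workers node.

# Stand-in for the real database query: one line per work item.
printf 'item1\nitem2\nitem3\n' > results.txt

# Count the records; strip whitespace that some wc implementations emit.
NRECORDS=$(wc -l < results.txt | tr -d ' ')

# Rewrite the submit file with one queued process per record.
# The \$ escapes keep $(Cluster)/$(Process) literal for condor_submit.
cat > workers.job <<EOF
universe   = vanilla
executable = worker.sh
arguments  = \$(Process) results.txt
output     = worker.\$(Cluster).\$(Process).out
error      = worker.\$(Cluster).\$(Process).err
log        = workers.log
queue $NRECORDS
EOF
```

Since DAGMan only reads workers.job when it actually submits the Workers node, the rewritten file is the one that takes effect.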
Brilliant! Brian, I owe you a beer (or beverage of your choice). I
didn't even think of that. I can modify both the workers.job and the
postprocess.job from the dbquery job... that would solve all my
problems. Except... I would have to make sure I never have 2 DBQuery
jobs running at the same time, since the .job files would get trampled
and the runs would get in the way of each other... unless I could pass
a parameter in so that it submits something like
worker$(Cluster).$(Process).job (and the same thing for postprocess).
More googling (or posts to this list).  Maybe it's time to read the
entire DAGMan manual page:
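One way to sidestep the trampling (an assumption on my part, not something the thread settles) is DAGMan's DIR keyword: give each submitted copy of the DAG its own working directory, so each run's generated .job files are private:

```
# Hypothetical sketch: each DAG instance is generated with its own
# run directory (run_001, run_002, ...), so the workers.job that
# QueryDB rewrites can't collide with another run's copy.
Job QueryDB     querydb.job     DIR run_001
Job Workers     workers.job     DIR run_001
Job PostProcess postprocess.job DIR run_001
PARENT QueryDB CHILD Workers
PARENT Workers CHILD PostProcess
```

The submit-file names stay fixed in the Job lines; uniqueness comes from stamping a fresh directory name into each generated DAG.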