Re: [condor-users] Calling condor_submit_dag recursively
- Date: Thu, 05 Feb 2004 08:04:31 +0000
- From: Mark Calleja <mcal00@xxxxxxxxxxxxx>
- Subject: Re: [condor-users] Calling condor_submit_dag recursively
Thanks for your reply.
Alain Roy wrote:
>> I'm trying to call condor_submit_dag recursively from within a DAGMan
>> POST script, but without much luck.
> I would think that would work, but I'm not aware that we've tried it.
> But it makes me wonder: why do you want to submit a DAG from a POST
> script? Shouldn't it be another node in the DAG?
> The purpose of a POST script is to do simple post-processing and to
> decide if the job that is in the node succeeded or failed. In this
> case, your DAG node will be considered to have succeeded if
> condor_submit_dag returns 0. Is that the semantics you want?
The point is that I don't know how many nodes I'll need, so recursion
via the POST scripts seemed ideal to me.
As it turned out, I believe it was a subtle (to me, anyway) file
contention problem. The new DAG job would try to open and write to its
DAG log files while the old one was still using them. What solved the
problem was forking off a new process from the POST script and sleeping
for a few seconds before kicking off the new job. I must admit, it took
me quite a while before I stumbled on that!
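For anyone who finds this thread later, here is a rough sketch of that workaround in Python. It is not the script I actually used: the function names, the 5-second default delay and the idea of passing the DAG file name as an argument are all assumptions for illustration. The delay is a workaround, not a guarantee that the old job has let go of its log files.

```python
import subprocess
import sys


def build_delayed_command(dag_file, delay_seconds=5, submit_cmd="condor_submit_dag"):
    """Return the argv for a helper process that sleeps, then runs the submit command.

    delay_seconds is an assumed value; tune it for your own pool.
    """
    helper = (
        "import subprocess, sys, time; "
        "time.sleep(float(sys.argv[1])); "
        "sys.exit(subprocess.call(sys.argv[2:]))"
    )
    return [sys.executable, "-c", helper, str(delay_seconds), submit_cmd, dag_file]


def submit_later(dag_file, delay_seconds=5, submit_cmd="condor_submit_dag"):
    """Start the helper in its own session and return without waiting for it.

    The caller (the POST script) can then exit straight away, while the
    helper submits the next DAG a few seconds later.
    """
    return subprocess.Popen(
        build_delayed_command(dag_file, delay_seconds, submit_cmd),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: detach from the POST script's session
    )
```

A POST script built on this would call something like submit_later("next.dag") and then exit with status 0. Note that the node's success then no longer depends on whether condor_submit_dag itself succeeded, since nothing waits for the helper.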
>> The initial submission works fine, but when the POST script tries to
>> kick off another dag (either by passing it to a "system" call, or by
>> forking and exec'ing a new process), nothing happens. The job and
>> dagman log files report no errors.
> Did you get any output from condor_submit_dag? Did it return an error?
> It's possible that the environment was set up so that it couldn't find
> condor_submit_dag, and it just failed. Could that be what happened?
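One way to answer those questions from inside a POST script is to check whether the command can be found on PATH, and to capture its output and exit status rather than discarding them. The following is only a sketch: run_and_report is a made-up helper name, and what you do with the result (write it to a file, print it) is left open.

```python
import shutil
import subprocess


def run_and_report(argv):
    """Run argv and return (found_on_path, return_code, combined_output).

    If the executable cannot be located, nothing is run and return_code is None.
    """
    if shutil.which(argv[0]) is None:
        return (False, None, f"{argv[0]} not found on PATH")
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout so nothing is lost
        text=True,
    )
    return (True, result.returncode, result.stdout)
```

Calling run_and_report(["condor_submit_dag", "next.dag"]) from the POST script and logging the three values would show whether the command was found at all and, if so, what it printed and returned.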