
Re: [Condor-users] Problems with DAG



On Fri, 28 Mar 2008, Alexander Dietz wrote:

I am running into trouble while running a DAG that contains other DAGs with
Condor version 7.0.1. When the DAGs are created, one of the sub-DAGs is called
"datafind.GRB060429_DATAFIND.dag". From this file, another file called
"datafind.GRB060429_DATAFIND.dag.condor.sub" is created by calling
"condor_submit_dag" (so it is written in the latter file). This file (i.e.
"datafind.GRB060429_DATAFIND.dag.condor.sub", which is attached) contains
several arguments (which are arguments to the "condor_dagman" command), but
for unknown reasons some of them do not appear in the documentation of
"condor_dagman", such as the argument "AutoRescue".
Since this file is created by "condor_submit_dag", obviously with no
arguments, how do these needless arguments get in? I cannot find them in the
Condor configuration files or in the environment variables. Is there some
other way that arguments could get in?

Is it possible that you're running a 7.1.0 pre-release condor_submit_dag binary? This is the only explanation I can think of. There are pretty major changes in how rescue DAGs work in 7.1.0, and the -Autorescue flag is part of that.

You can find the version of your condor_submit_dag binary by doing the following:

    strings `which condor_submit_dag` | grep CondorVersion:

(assuming you're on some flavor of Unix or Linux; I'm not sure what the equivalent is on Windows).

Does the datafind.GRB060429_DATAFIND.dag.condor.sub file work okay for DAGMan itself, or does it cause problems? If it causes problems, you may have a version mismatch between condor_submit_dag and condor_dagman.

(One general note here: the DAGMan version doesn't have to match the version of the rest of the Condor installation, but the versions of condor_submit_dag and condor_dagman should always match each other.)
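
If you want to check for such a mismatch directly, something along these lines might help. It is only a rough sketch: the function names are made up for illustration, it uses grep -a instead of strings, and it assumes both binaries embed a string of the form "CondorVersion: <number>", which is what the strings command above relies on.

```shell
# Hypothetical helper: print the embedded "CondorVersion: x.y.z" string
# of a binary, if there is one.
# grep -a treats the binary as text; -o prints only the matching part.
condor_binary_version() {
    grep -a -o 'CondorVersion: [0-9][0-9.]*' "$1" | head -n 1
}

# Hypothetical helper: print the embedded versions of two binaries and
# return 0 only when both are found and identical.
condor_versions_match() {
    v1=$(condor_binary_version "$1")
    v2=$(condor_binary_version "$2")
    echo "$1: ${v1:-unknown}"
    echo "$2: ${v2:-unknown}"
    [ -n "$v1" ] && [ "$v1" = "$v2" ]
}

# Usage (assumes both tools are on your PATH):
#   condor_versions_match "$(command -v condor_submit_dag)" \
#                         "$(command -v condor_dagman)" \
#       || echo "condor_submit_dag and condor_dagman versions differ"
```

A non-zero return from condor_versions_match means either that the two version strings differ or that one of them could not be found in the binary.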

Kent Wenger
Condor Team