
Re: [Condor-users] DAGman question



On Mon, Aug 09, 2004 at 10:32:17AM -0500, Scott Koranda wrote:
> Hi,
> 
> We are using Condor 6.6.1 (will be upgrading very soon to
> 6.6.6...).
> 
> If I submit an ordinary job (not part of a DAG) to Condor with
> 
>         getenv = True 
> 
> then my environment is picked up and passed along to my
> executable, including X509_USER_PROXY.
> 
> If, however, the job is part of a DAG then for whatever reason
> X509_USER_PROXY is NOT being passed to my job, and my job is
> not able to authenticate properly to various Grid services.
> 
> So I have some questions:
> 
> 1) Is this behaviour known and understood?
> 

Yes. 

> 2) Is it "fixed" in 6.6.6?
> 

No.

> 3) When the job is run as part of a DAG just what environment
> is it inheriting?
> 

It's a touch hard to follow, but basically, 'getenv = true' in a job
submitted by DAGMan gets the environment of the 'condor_submit' that
DAGMan runs. The environment that DAGMan has is a blend of the environment
of the condor_schedd (and the condor_schedd gets what the condor_master was
started with) and whatever the submit file for DAGMan says it should have
(that submit file is usually written automatically by condor_submit_dag).
You have to be careful, because I don't think that condor_submit_dag
puts a 'getenv = true' into the submit file it writes and submits, so
you probably have to jump through some hoops to get something into
the environment of DAGMan (probably by using
"condor_submit_dag -a 'environment=MY_ENV_VARIABLE=/some/path;'").

condor_master
     |
condor_schedd <-------+
     | \              |
     |  +-----------------------+
condor_dagman         |         |
     |                |         | 
condor_submit job_a---+       gridmanager for job_a
                                |
                             gahp_server (Globus GRAM libraries)
                                |
                        [Globus protocols to the outside world]    
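
To make the blend described above concrete, here is a rough Python model of
it. This is only an illustration, not Condor code: the function names are
made up, and the rule that submit-file entries win over the schedd's values
on a name clash is an assumption. The point it shows is that the shell you
ran condor_submit_dag from contributes nothing unless the DAGMan submit file
says so.

```python
def parse_environment(spec):
    """Parse a semicolon-delimited 'NAME=value;NAME2=value2;' string into a dict."""
    env = {}
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate the trailing ';'
        name, _, value = entry.partition("=")
        env[name] = value
    return env


def dagman_environment(schedd_env, submit_file_environment):
    """Blend the schedd's environment with the DAGMan submit file's entries.

    Assumption (not confirmed by the post): submit-file entries override
    the schedd's values when both define the same name.
    """
    blended = dict(schedd_env)
    blended.update(parse_environment(submit_file_environment))
    return blended


if __name__ == "__main__":
    # Hypothetical values: what the condor_master (and so the schedd) started with.
    schedd_env = {"PATH": "/usr/bin"}
    # What 'condor_submit_dag -a' put into the DAGMan submit file.
    env = dagman_environment(schedd_env, "MY_ENV_VARIABLE=/some/path;")
    for name in sorted(env):
        print(name, "=", env[name])
```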


> 4) condor_dagman appears to have no help information:
> 

condor_dagman is a bit weird - it's not really a user tool (like condor_submit
or condor_q), but more of a daemon (just one that users can start on-demand).
As such, it's built on the Condor daemon_core libraries, and doesn't have a
help screen (because it's never really meant to be run directly from a command
line).

The problem is that the Condor daemon libraries remove X509_USER_PROXY from
their environment automatically, because the GSI libraries that Condor uses
for security will use it, and the common case there is for the X509_USER_PROXY
variable to be inadvertently set. So even if you did a
condor_submit_dag -a 'environment=X509_USER_PROXY=/some/path;' and
got it into the environment, DAGMan would remove it and not pass it on
to the environment it sets up for condor_submit.

To work around it, either use x509userproxy in the submit files of the
jobs you're submitting (always the safest) or use
condor_submit_dag -a 'environment=_CONDOR_GSI_DAEMON_PROXY=/some/path;'. The
Condor daemons use that setting for their security, and as a side effect, the
environment that DAGMan sets up for 'condor_submit' will have X509_USER_PROXY
set to whatever GSI_DAEMON_PROXY is set to. (We know, it's not ideal. But it
will work today.)
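
As a sanity check on the behaviour just described, the following Python
sketch models what happens to the proxy variable on the way from DAGMan to
'condor_submit'. It is a toy model under stated assumptions, not the real
daemon code: the function name is invented, and the order of the two steps
(strip first, then re-set from the daemon proxy setting) is assumed for
illustration.

```python
def environment_for_condor_submit(dagman_env):
    """Model the environment DAGMan hands to condor_submit.

    Assumed behaviour, paraphrasing the explanation above:
      1. X509_USER_PROXY is always removed from the daemon's environment.
      2. If _CONDOR_GSI_DAEMON_PROXY is set, X509_USER_PROXY is set to its
         value as a side effect.
    """
    env = dict(dagman_env)
    env.pop("X509_USER_PROXY", None)  # step 1: the automatic removal
    daemon_proxy = env.get("_CONDOR_GSI_DAEMON_PROXY")
    if daemon_proxy:
        env["X509_USER_PROXY"] = daemon_proxy  # step 2: the side effect
    return env


if __name__ == "__main__":
    # Setting X509_USER_PROXY directly: it never reaches condor_submit.
    direct = environment_for_condor_submit({"X509_USER_PROXY": "/some/path"})
    print("X509_USER_PROXY" in direct)

    # Setting _CONDOR_GSI_DAEMON_PROXY instead: the proxy path comes through.
    workaround = environment_for_condor_submit(
        {"_CONDOR_GSI_DAEMON_PROXY": "/some/path"}
    )
    print(workaround.get("X509_USER_PROXY"))
```

A job with 'getenv = true' would then see whatever this function returns,
which is why only the second form gets the proxy path to the job.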

We've debated in the past how to make this less confusing, and let people
easily set X509_USER_PROXY but still not trip up the daemons, and we still haven't
come up with a consensus internally (I'm pushing for a way to influence the
environment that DAGMan sets up for 'condor_submit', personally). I suspect we're
just going to take out the X509_USER_PROXY removal bit.

-Erik