Re: [Condor-users] Problem running Grid jobs using Condor.
- Date: Thu, 16 Apr 2009 16:30:05 -0600
- From: Balamurali Ananthan <bala@xxxxxxxxxx>
- Subject: Re: [Condor-users] Problem running Grid jobs using Condor.
Found the solution:
It had nothing to do with DNS resolution or running the job as nobody.
The problem was that, although the home dir (/home/research/bala) existed
on both the execute and submit machines, it was not cross-mounted. So
the job's output location existed on the submit machine but was not
accessible from the execute machine.
I tweaked condor.pm so that the output files are produced in /tmp,
and the jobs ran to completion. So I am wondering why Condor did not
create '/.globus/job/vulcan.txcorp.com/9128.1239817731/stdout' on the
execute machine, even though the Condor master on the execute machine
was started as root?
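
For anyone hitting the same hold reason: errno 2 (ENOENT) when opening a
file for writing generally means that some directory in the path does not
exist on that machine. Below is a minimal sketch, in Python, of a check
that could be run on the execute machine before submitting. The function
name check_output_path is my own, not part of Condor, and the script only
inspects the local filesystem; it says nothing about what Condor itself
does with the path.

```python
import os


def check_output_path(path):
    """Report whether a file at `path` could be created on this machine.

    Hypothetical helper: it only looks at the parent directory, which is
    what matters for an ENOENT (errno 2) failure when opening for writing.
    """
    parent = os.path.dirname(path) or "."
    if not os.path.isdir(parent):
        return "missing directory: " + parent
    # Creating a file needs write and search permission on the directory.
    if not os.access(parent, os.W_OK | os.X_OK):
        return "directory not writable: " + parent
    return "ok"


if __name__ == "__main__":
    import sys

    # Usage: python check_output_path.py /path/to/stdout [more paths...]
    for p in sys.argv[1:]:
        print(p, "->", check_output_path(p))
```

Run it as the same account the job is expected to run as, since the
permission check depends on the calling user.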
Balamurali Ananthan wrote:
I am trying to run a job in the condor system submitted through the
But the jobs are being held for this reason:
HoldReason = "Error from starter on slot1@xxxxxxxxxxxxxxxxxxx: Failed to
as standard output: No such file or directory (errno 2)"
Here is what I already did:
1. Started the execute machine's master daemon as root.
2. Set the UID_DOMAIN in the condor_config on the execute machine to
3. Set the TRUST_UID_DOMAIN = TRUE on the execute machine
4. The account under which the job is supposed to run is not in the
/etc/passwd file on the execute machine, so SOFT_UID_DOMAIN = TRUE is
set on the execute machine.
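
To double-check that the three settings above really are present in the
config file being read, something like the following sketch could help.
It is a deliberately simplified reader: it handles only plain
"NAME = value" lines and comment lines, not macros or line continuations,
and it assumes setting names are case-insensitive. The sample text and
the domain example.org are hypothetical placeholders, not the values
from my setup.

```python
def read_settings(text, names):
    """Return the last plain assignment found for each requested name.

    Simplified on purpose: no macro expansion, no line continuations.
    """
    found = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip().upper()  # assumption: names are case-insensitive
        if key in names:
            found[key] = value.strip()
    return found


# Hypothetical sample; example.org is a placeholder domain.
SAMPLE = """\
# UID handling on the execute machine
UID_DOMAIN = example.org
TRUST_UID_DOMAIN = TRUE
SOFT_UID_DOMAIN = TRUE
"""

WANTED = {"UID_DOMAIN", "TRUST_UID_DOMAIN", "SOFT_UID_DOMAIN"}
settings = read_settings(SAMPLE, WANTED)
for name in sorted(WANTED):
    print(name, "=", settings.get(name, "<not set>"))
```

The condor_config_val tool reports the value the daemons actually use,
so that is the more reliable check when it is available.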
However, the execute machine (10.0.0.2) cannot do a DNS lookup, so it
has no way to resolve 10.0.0.105 to vulcan.txcorp.com (the submit
machine) through DNS, although /etc/hosts can be used to resolve
10.0.0.105 to vulcan.txcorp.com.
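
One way to see what the execute machine's own resolver returns for the
submit machine's address is a small reverse lookup, sketched here in
Python. On a typical Linux setup the system resolver consults /etc/hosts
as well as DNS, but that depends on the machine's name service
configuration. The function name reverse_lookup is my own, and this only
shows what the operating system answers; whether Condor uses the same
path is an assumption I have not verified.

```python
import socket


def reverse_lookup(ip):
    """Return the hostname the system resolver gives for `ip`, or None."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


# Example call on the execute machine:
#   print(reverse_lookup("10.0.0.105"))
```

If this prints None on the execute machine, the /etc/hosts entry is not
being picked up for reverse lookups.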
1. Does the execute machine depend only on DNS to resolve the IP
address to its name? And if that fails, does it run the job as nobody?
2. How can I see which account the job is actually being run as? I'm
guessing that the job is run as nobody while it is supposed to be
running as bala. How do I check it?
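
One approach to the second question that does not rely on any Condor
feature is to submit a tiny job that reports its own identity. Below is
a minimal sketch, assuming Python is available on the execute machine
and that the job's output goes somewhere that exists there (for example
/tmp). The function name describe_identity is my own.

```python
import os
import pwd


def describe_identity():
    """Return the UID, GID, and account name this process is running as."""
    uid = os.getuid()
    try:
        name = pwd.getpwuid(uid).pw_name
    except KeyError:
        # The UID has no passwd entry on this machine, which is the
        # situation SOFT_UID_DOMAIN is meant to allow.
        name = None
    return {"uid": uid, "gid": os.getgid(), "name": name}


if __name__ == "__main__":
    print(describe_identity())
```

Comparing the printed uid with the output of "id bala" on the submit
machine and "id nobody" on the execute machine should show which account
was used.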
Balamurali Ananthan (bala@xxxxxxxxxx) (720.974.1843)
Tech-X Corp, 5621 Arapahoe Ave, Suite A, Boulder, CO 80303