
Re: [Condor-users] "Failed to open as standard output" error




On Tue, 14 Feb 2006, Jaime Frey wrote:

> On Feb 10, 2006, at 3:56 PM, Ilya Narsky wrote:
> 
> > We installed condor-6.7.13.x86_rh_9 on a testbed cluster at
> > Caltech. Now I am trying to submit a globus job:
> >
> > [narsky@citgrid3 OSG]$ globus-job-run
> > citgrid3.cacr.caltech.edu:2119/jobmanager-condor /bin/date
> >
> > The job becomes idle and never finishes. StarterLog.vm1 on the worker
> > node shows this error:
> >
> ...
> > 2/10 12:21:05 Failed to open
> > '/home/narsky/.globus/job/citgrid3.cacr.caltech.edu/18520.1139599210/stdout'
> > as standard output: No such file or directory (errno 2)
> > 2/10 12:21:05 Failed to open
> > '/home/narsky/.globus/job/citgrid3.cacr.caltech.edu/18520.1139599210/stderr'
> > as standard error: No such file or directory (errno 2)
> 
> Is /home/narsky/.globus/job on a shared filesystem?

Yes, nfs.

We reinstalled Condor (mostly because we wanted to move it to a different
location), and now StarterLog.vm1 on the worker node shows a new error about
UidDomain ahead of the 'Failed to open' error, which now reports
"Permission denied" (errno 13) instead of "No such file or directory"
(errno 2). UID_DOMAIN is set to 'local' in both the headnode and the global
condor_config files.

Thanks in advance,  -Ilya

2/14 11:02:21 Using config file: /home/condor/condor_config
2/14 11:02:21 Using local config files: /apps/condor/condor/hosts/compute-0-0/condor_config.local
2/14 11:02:21 DaemonCore: Command Socket at <192.168.0.253:33937>
2/14 11:02:21 Done setting resource limits
2/14 11:02:21 Communicating with shadow <192.168.0.254:49538>
2/14 11:02:21 Submitting machine is "citgrid3.cacr.caltech.edu"
2/14 11:02:21 ERROR: the submitting host claims to be in our UidDomain (local), yet its hostname (citgrid3.cacr.caltech.edu) does not match
2/14 11:02:21 Starting a VANILLA universe job with ID: 5.0
2/14 11:02:21 IWD: /home/narsky
2/14 11:02:21 Failed to open '/home/narsky/.globus/job/citgrid3.cacr.caltech.edu/5231.1139943730/stdout' as standard output: Permission denied (errno 13)
2/14 11:02:21 Failed to open '/home/narsky/.globus/job/citgrid3.cacr.caltech.edu/5231.1139943730/stderr' as standard error: Permission denied (errno 13)
2/14 11:02:21 Failed to open some/all of the std files...
2/14 11:02:21 Aborting OsProc::StartJob.
2/14 11:02:21 Failed to start job, exiting
2/14 11:02:21 ShutdownFast all jobs.
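
For reference, here is a rough sketch of the kind of comparison the
UidDomain message above seems to describe. It is only an illustration in
Python, not Condor's actual code: the function name host_in_uid_domain is
made up, and the suffix-matching rule is an assumption based on the wording
of the log line ("claims to be in our UidDomain (local), yet its hostname
(citgrid3.cacr.caltech.edu) does not match").

```python
def host_in_uid_domain(hostname: str, uid_domain: str) -> bool:
    """Hypothetical check: is `hostname` inside `uid_domain`?

    Assumption (not taken from Condor's source): a host counts as being in
    the domain when its fully qualified name equals the domain or ends
    with "." + domain.
    """
    host = hostname.strip().lower().rstrip(".")
    domain = uid_domain.strip().lower().strip(".")
    if not domain:
        # An empty domain cannot contain any host.
        return False
    return host == domain or host.endswith("." + domain)


if __name__ == "__main__":
    # Hostname taken from the StarterLog above.
    submit_host = "citgrid3.cacr.caltech.edu"
    # "local" is the configured UID_DOMAIN; the other two are shown only
    # for comparison.
    for candidate in ("local", "cacr.caltech.edu", "caltech.edu"):
        print(candidate, host_in_uid_domain(submit_host, candidate))
```

Under that assumption, 'local' is not a suffix of
citgrid3.cacr.caltech.edu, which would be consistent with the ERROR line.
Whether this mismatch is also what leads to the "Permission denied" on
stdout/stderr is not something the log itself shows.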