Re: [Condor-users] CondorG pools
- Date: Wed, 17 Nov 2010 21:45:35 -0600 (CST)
- From: Steven Timm <timm@xxxxxxxx>
- Subject: Re: [Condor-users] CondorG pools
Do the two nodes (the manager and the other one) share a set
of home directories for the globus user? Globus expects
that the stdout, stderr, and proxy files will be exported to
the other worker nodes via NFS; that's what the .globus
directory in each user's home directory is for.
You can change this
if you know what you're doing and hack the globus scripts
to have condor transfer the files to the worker node, but it's tricky.
(tricky enough that I haven't tried it yet, even though I have several
globus-condor pools and have been running them for 6 years). You
are looking for a perl module called condor.pm about 6 directories
deep in the globus software.
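
To answer the shared-home-directory question on each node, it may help to
look at which filesystem the globus user's home directory actually sits on.
What follows is only a minimal Python sketch, assuming Linux nodes that
expose /proc/mounts; the function names (parse_mounts, fs_type_for,
is_shared) are made up for illustration and are not part of Globus or
Condor. Seeing "nfs" on both nodes still does not prove they mount the same
export, so treat the result as a hint rather than a verdict.

```python
import os

# Filesystem types treated as "shared" for this check (an assumption;
# extend the set if your site uses something other than NFS).
NETWORK_FS = {"nfs", "nfs4"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Field 1 is the mount point, field 2 the filesystem type.
            # Escaped spaces (\040) in mount points are not decoded here.
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts_text):
    """Return the fs type of the longest mount point containing path."""
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for mount_point, fs_type in parse_mounts(mounts_text):
        mount_point = os.path.normpath(mount_point)
        prefix = mount_point.rstrip("/") + "/"
        # Compare whole path components so /home does not match /homeless.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_shared(path, mounts_text):
    """True if path lives on one of the network filesystems listed above."""
    return fs_type_for(path, mounts_text) in NETWORK_FS


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    # Run this as the globus user on the manager and on the second node.
    globus_dir = os.path.join(os.path.expanduser("~"), ".globus")
    with open("/proc/mounts") as handle:
        mounts = handle.read()
    print(globus_dir, fs_type_for(globus_dir, mounts), is_shared(globus_dir, mounts))
```

If the second node reports a local filesystem type for that directory while
the manager reports nfs (or both report a local type), the job's stdout path
will not exist where the job lands, which would fit the hold reason quoted
below.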
On Wed, 17 Nov 2010, Roy, Kevin (LNG-SEA) wrote:
I am using condor 7.4.4 on two VMware Ubuntu machines.
I have set up Globus and can submit and run jobs. I have set up Condor
and can submit and run jobs. If there is only one machine I can use
Globus to run jobs on Condor and vice versa.
When I add a second machine and submit 5 jobs, 2-3 go to
one machine and the rest to the other machine. On the manager machine the
jobs run without a problem. On the second machine the jobs appear to
start running and are then put on hold for the following reason (from
Hold reason: Error from helium.adiroy.com: Failed to open
26358823932351/stdout' as standard output: No such file or directory
The directory does not exist, and creating it by hand changes nothing. I have
opened my ports for GLOBUS_TCP_PORT, and this did nothing either. I have
searched quite extensively on the web but cannot find any more
information. Can someone help me? Thanks in advance
The job is defined as
executable = /bin/hostname
globusscheduler = hydrogen
universe = globus
output = condorg.out.$(cluster).$(Process)
log = condorg.log.$(cluster).$(Process)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
stream_output = true
stream_error = true
Steven C. Timm, Ph.D (630) 840-8525
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.