
Re: [Condor-users] CondorG pools



Thanks Steve.

I tried to share the /home of my Ubuntu machines.  The VMware I am using
doesn't let me share virtual hard drives.

I cannot find any way to share the home directory.  

How do I go about either changing the location defined in
GLOBUS_REMOTE_IO_URL (or, more importantly, changing the shared area) or
turning off stdout altogether?

I tried using Samba to link /home/condor to my host machine, but logging
in produced some errors and there appear to be a lot of permission
problems.

Kevin
 
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Steven Timm
Sent: Wednesday, November 17, 2010 7:46 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] CondorG pools

Do the two nodes, manager and the other one, have a shared set
of home directories for the Globus user?  Globus expects
that the stdout, stderr, and proxy files will be exported to
the other worker nodes via NFS, that's what the .globus
directory is for in each user's id.
You can change this
if you know what you're doing and hack the globus scripts
to have condor transfer the files to the worker node, but it's tricky
(tricky enough that I haven't tried it yet, even though I have several
globus-condor pools and have been running them for 6 years).  You
are looking for a perl module called condor.pm about 6 directories
deep in the globus software.

Steve
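
A minimal sketch, not from the thread, of one way to check on the worker
node whether the directory named in the hold reason is actually visible
there. The function names and the regular expression are hypothetical,
and the pattern is written against the hold-reason wording quoted below;
other Condor versions may phrase the message differently.

```python
import os
import re

# Pattern for the hold reason quoted in this thread (assumption: other
# Condor versions may word the message differently).
HOLD_RE = re.compile(
    r"Error from (?P<host>\S+): Failed to open\s+'(?P<path>[^']+)'\s+"
    r"as standard (?P<stream>output|error)"
)


def parse_hold_reason(reason):
    """Return (host, path, stream) from a hold reason, or None if no match."""
    m = HOLD_RE.search(reason)
    if m is None:
        return None
    return m.group("host"), m.group("path"), m.group("stream")


def nearest_existing_dir(path):
    """Walk up from path until a directory that exists on this node is found."""
    current = os.path.dirname(path)
    while current and not os.path.isdir(current):
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent
    return current


def diagnose(reason):
    """Describe whether the job directory from the hold reason exists here."""
    parsed = parse_hold_reason(reason)
    if parsed is None:
        return "hold reason not recognized"
    host, path, stream = parsed
    job_dir = os.path.dirname(path)
    if os.path.isdir(job_dir):
        writable = os.access(job_dir, os.W_OK)
        return f"{job_dir} exists here (writable: {writable})"
    deepest = nearest_existing_dir(path)
    return f"{job_dir} missing here; deepest existing parent: {deepest}"
```

Running diagnose() on each node with the hold reason text shows how far
down the path the two machines agree. If the job directory is missing on
the second machine only, the home directory is not shared.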


On Wed, 17 Nov 2010, Roy, Kevin (LNG-SEA) wrote:

> I am using condor 7.4.4 on two VMware Ubuntu machines.
>
>
>
> I have setup Globus and can submit and run jobs.  I have setup Condor
> and can submit and run jobs.  If there is only one machine I can use
> Globus to run jobs on condor and vice versa.
>
>
>
> When I add a second machine and submit 5 jobs, 2-3 go to one machine
> and the rest to the other machine.  On the manager machine the
> jobs run without a problem.  On the second machine the jobs appear to
> start to run and are then put on hold, for the following reason (from
> condor_q -better-analyze):
>
>
>
> Hold reason: Error from helium.adiroy.com: Failed to open
> '/home/globus/.globus/job/hydrogen.adiroy.com/16073795612117631466.2588226358823932351/stdout'
> as standard output: No such file or directory (errno 2)
>
>
>
> The directory does not exist, and creating it makes no difference.  I have
> opened my ports for GLOBUS_TCP_PORT and this too did nothing.  I have
> searched quite extensively on the web but cannot find any more
> information.  Can someone help me?  Thanks in advance
>
>
>
> The job is defined as
>
> executable = /bin/hostname
> globusscheduler = hydrogen
> universe = globus
> output = condorg.out.$(cluster).$(Process)
> log = condorg.log.$(cluster).$(Process)
>
> should_transfer_files = YES
> when_to_transfer_output = ON_EXIT
>
> stream_output = true
> stream_error = true
>
> queue 5
>

-- 
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group
Leader.