
Re: [Condor-users] glidein resources not found




Make sure you have full connectivity (both inbound and outbound) between the compute nodes and your main Condor cluster (at least the machine running the schedd and the collector/negotiator). If you have to deal with a firewall or machines with multiple network interfaces, then things get more complicated. See http://www.cs.wisc.edu/condor/manual/v6.6/3_10Setting_Up.html.
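As a rough way to test the outbound half of this from a compute node, here is a minimal Python sketch. The helper name can_connect is made up for illustration and is not part of Condor. It attempts a plain TCP connection to the collector address and reports the symbolic errno name on failure. On Linux, errno 111 (the value in the log quoted below) is ECONNREFUSED, which usually means the host was reachable but nothing accepted the connection on that port, or a firewall actively rejected it.

```python
import errno
import socket


def can_connect(host, port, timeout=5.0):
    """Try one TCP connection; return (True, None) or (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # exc.errno is None for timeouts, so fall back to the exception name.
        reason = errno.errorcode.get(exc.errno, type(exc).__name__)
        return False, reason
```

For example, can_connect("130.39.128.94", 9618) run on the glidein node would exercise the same address the startd is trying to reach. This only checks one direction; inbound connectivity to the glidein node has to be tested separately from the central manager side.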


Also make sure that your security policy in condor_config allows full access to the machines where glidein will be running. For example, you may need to adjust HOSTALLOW_WRITE and HOSTALLOW_READ.

--Dan

Arun Nayar wrote:

That started the master, but now I get a connection refused error in the
master and startd logs on the resource. Is this a firewall issue?

12/7 15:47:39 vm1: Error sending update to collector(s)
12/7 15:47:39 Can't connect to <130.39.128.94:9618>:0, errno = 111
12/7 15:47:39 Will keep trying for 10 seconds...
12/7 15:47:49 Connect failed for 10 seconds; returning FALSE
12/7 15:47:49 ERROR: SECMAN:2003:TCP connection to <130.39.128.94:9618> failed


--Arun


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Dan Bradley
Sent: Tuesday, December 07, 2004 3:26 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] glidein resources not found


Arun Nayar wrote:




After running condor_glidein, I don't see the resource in condor_status. Is something going wrong? I can run simple Condor-G jobs on this resource using universe=globus, but I don't see any of the daemons started on the remote resource. Why is that, and how can I debug this problem? I also tried manually running condor_master -dyn -f on the Globus resource, and it gave me a library not found error. Doesn't glidein automatically decide which tarball to use for the remote resource?




To see what is happening to the glidein daemons, you may need to look in their log files. By default, these are stored on the remote resource in $(HOME)/Condor_glidein/local/log.
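If it helps, a small Python sketch like the following could be run on the remote resource to pull the suspicious lines out of whatever files are in that directory. The function name find_error_lines and the keyword list are my own choices for illustration, not anything Condor provides, and the sketch makes no assumption about the individual log file names.

```python
import os


def find_error_lines(log_dir, keywords=("ERROR", "Can't connect", "failed")):
    """Return (file name, line) pairs for log lines containing any keyword."""
    hits = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with open(path, errors="replace") as handle:
            for line in handle:
                if any(keyword in line for keyword in keywords):
                    hits.append((name, line.rstrip("\n")))
    return hits
```

Calling find_error_lines(os.path.expanduser("~/Condor_glidein/local/log")) would scan the default location mentioned above; adjust the path if your glidein was set up elsewhere.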


If running the daemons by hand is producing link errors, glidein may be choosing the wrong architecture. Currently, it does not auto-detect the glibc library version. Looking at the glidein tarball directory, I see the following choices for Linux:

6.6.7-i686-pc-Linux-2.4.tar.gz
6.7.2-i686-pc-Linux-2.4.tar.gz
6.7.2-i686-pc-Linux-2.4-glibc2.2-static.tar.gz
6.7.2-i686-pc-Linux-2.4-glibc2.2.tar.gz

The versions without a glibc designation are for glibc2.3. If you are running on glibc2.2 machines, I would recommend trying 6.7.2-i686-pc-Linux-2.4-glibc2.2.tar.gz. Since you are running an older version of condor_glidein, you would need to rename this to the version/architecture that glidein auto-chooses and you would need to use -forcesetup to force the existing installation to be overwritten.
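Since glidein does not detect glibc for you, one way to check by hand is a short Python sketch run on the remote resource. This is only an illustration of the naming rule described above: the helper names are hypothetical, the mapping covers only glibc 2.2 and 2.3 (the two cases mentioned here), and it assumes a tarball with the resulting name actually exists in the tarball directory (in the listing above, the glibc2.2 build is only shown for 6.7.2).

```python
import platform


def local_glibc_version():
    """Return the glibc version string of this Python's C library, or None."""
    lib, version = platform.libc_ver()
    return version if lib == "glibc" else None


def pick_tarball(condor_version, glibc_version):
    """Map a glibc version such as '2.2.5' to a tarball name as listed above."""
    major_minor = tuple(int(part) for part in glibc_version.split(".")[:2])
    base = f"{condor_version}-i686-pc-Linux-2.4"
    if major_minor == (2, 2):
        return base + "-glibc2.2.tar.gz"
    if major_minor == (2, 3):
        # Names without a glibc designation are the glibc2.3 builds.
        return base + ".tar.gz"
    raise ValueError(f"no tarball rule known for glibc {glibc_version}")
```

For instance, pick_tarball("6.7.2", "2.2.5") gives 6.7.2-i686-pc-Linux-2.4-glibc2.2.tar.gz. With an older condor_glidein you would still have to rename that file and use -forcesetup as described above.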

--Dan


_______________________________________________ Condor-users mailing list Condor-users@xxxxxxxxxxx http://lists.cs.wisc.edu/mailman/listinfo/condor-users
