
[condor-users] condor_collector problem with 6.6.0 under IRIX



Hi chaps,

May I take this opportunity to further display my ignorance in this
forum. I'm upgrading our 6.4.5 pool to 6.6.0, with the master node being
an SGI O2 running IRIX 6.5, which has given valiant service in this role
with 6.4.5 for nearly a year. The upgrade installation goes swimmingly,
but on issuing condor_master all the relevant daemons come up except
condor_collector, so running condor_q -global gives the relevant error
message. The MasterLog has the following entry:


11/21 14:55:02 ******************************************************
11/21 14:55:02 ** condor_master (CONDOR_MASTER) STARTING UP
11/21 14:55:02 ** $CondorVersion: 6.6.0 Nov 14 2003 $
11/21 14:55:02 ** $CondorPlatform: SGI-IRIX65 $
11/21 14:55:02 ** PID = 277284
11/21 14:55:02 ******************************************************
11/21 14:55:02 Using config file: /pond/home/condor/IRIX/current_release/etc/condor_config
11/21 14:55:02 Using local config files: /usr/condor/condor_config.local
11/21 14:55:02 DaemonCore: Command Socket at <131.111.41.187:9633>
11/21 14:55:02 Started DaemonCore process "/home/condor/IRIX/current_release/sbin/condor_startd", pid and pgroup = 278251
11/21 14:55:02 Started DaemonCore process "/home/condor/IRIX/current_release/sbin/condor_schedd", pid and pgroup = 278172
11/21 14:55:02 Started DaemonCore process "/home/condor/IRIX/current_release/sbin/condor_kbdd", pid and pgroup = 276100
11/21 14:55:07 Can't connect to <131.111.41.187:9618>:0, errno = 146
11/21 14:55:07 Will keep trying for 10 seconds...
11/21 14:55:17 Connect failed for 10 seconds; returning FALSE
11/21 14:55:17 ERROR: SECMAN:2003:TCP connection to <131.111.41.187:9618> failed


Now an errno of 146 maps to ECONNREFUSED according to <sys/errno.h>, so
on the chance that this port's already been bagged by some other
application I ran netstat -a before and after starting condor and looked
at which ports are being used. Port 9618 is not being used at all, and
the only effect of running condor is to use the following ports:

> tcp     0      0  silica.9633        *.*                 LISTEN
> tcp     0      0  silica.9657        *.*                 LISTEN
> tcp     0      0  silica.9617        *.*                 LISTEN
> tcp     0      0  silica.9609        *.*                 LISTEN
> tcp     0      0  silica.9610        *.*                 LISTEN
> udp     0      0  silica.9603        *.*
> udp     0      0  silica.9609        *.*
> udp     0      0  silica.9610        *.*
> udp     0      0  silica.9617        *.*
> udp     0      0  silica.9633        *.*
> udp     0      0  silica.9657        *.*
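
For anyone wanting to repeat the connection test outside Condor, here is a minimal Python sketch that attempts a TCP connect and reports the symbolic errno name rather than the number. The numeric value is platform-specific (146 on this IRIX box, 111 on Linux, for instance), so the name is the safer thing to compare. The function name probe_tcp and the 2-second timeout are my own choices for illustration, not anything Condor provides.

```python
import errno
import socket


def probe_tcp(host, port, timeout=2.0):
    """Attempt a TCP connect; return 'open', 'timeout', or an errno name."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns an errno value instead of raising for
        # ordinary connection failures such as ECONNREFUSED.
        code = sock.connect_ex((host, port))
    except socket.timeout:
        return "timeout"
    finally:
        sock.close()
    if code == 0:
        return "open"
    # Translate the platform-specific number into its symbolic name.
    return errno.errorcode.get(code, "errno %d" % code)
```

Calling probe_tcp("131.111.41.187", 9618) while the collector is down should reproduce the refusal seen in the MasterLog, assuming nothing else answers on that port.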

Not only is port 9618 not being used, but I can't even see port 9614
being taken by the negotiator. I should point out that I haven't altered
the port ranges that condor uses from the defaults.
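
The comparison above can be done mechanically. The following is a rough Python sketch that parses the netstat lines quoted in this message and reports which of the two expected ports (9618 for the collector, 9614 for the negotiator) are absent. The helper names listening_ports and missing_daemons are hypothetical, and the parser assumes the "host.port" local-address layout shown above; other netstat variants would need adjusting.

```python
# netstat -a lines as quoted in this message (the "> " prefix is tolerated).
NETSTAT = """\
> tcp     0      0  silica.9633        *.*                 LISTEN
> tcp     0      0  silica.9657        *.*                 LISTEN
> tcp     0      0  silica.9617        *.*                 LISTEN
> tcp     0      0  silica.9609        *.*                 LISTEN
> tcp     0      0  silica.9610        *.*                 LISTEN
> udp     0      0  silica.9603        *.*
> udp     0      0  silica.9609        *.*
> udp     0      0  silica.9610        *.*
> udp     0      0  silica.9617        *.*
> udp     0      0  silica.9633        *.*
> udp     0      0  silica.9657        *.*
"""

# Ports this message expects the daemons to hold.
EXPECTED = {"collector": 9618, "negotiator": 9614}


def listening_ports(text):
    """Return {'tcp': set, 'udp': set} of local ports found in the text."""
    ports = {"tcp": set(), "udp": set()}
    for line in text.splitlines():
        fields = line.lstrip("> ").split()
        if len(fields) < 4 or fields[0] not in ports:
            continue
        # For TCP, only count sockets in the LISTEN state.
        if fields[0] == "tcp" and fields[-1] != "LISTEN":
            continue
        # Local address looks like "silica.9633"; the port follows the last dot.
        port = fields[3].rpartition(".")[2]
        if port.isdigit():
            ports[fields[0]].add(int(port))
    return ports


def missing_daemons(ports):
    """Return the expected daemons whose port appears under neither protocol."""
    seen = ports["tcp"] | ports["udp"]
    return {name: p for name, p in EXPECTED.items() if p not in seen}


if __name__ == "__main__":
    for name, port in missing_daemons(listening_ports(NETSTAT)).items():
        print("%s: nothing bound to port %d" % (name, port))
```

Run against the listing above, this flags both the collector and the negotiator as missing, which matches what I see by eye.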

Can any of you chaps spot anything that I'm obviously missing?

Thanks for any help,

Mark


