
Re: [condor-users] Troubleshooting process



condor_status on a client returns nothing.  The clients are running only
condor_master and nothing else.  condor_startd keeps exiting with status 4.
condor_config_val returned "MASTER,STARTD".  The daemons are trying to connect
to the server at 172.16.0.1, which is correct.

Maybe I have the firewall misconfigured.  I thought I had disabled everything
firewall-related, since this is a totally isolated network.
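
One quick way to test the firewall theory from a client is to attempt a plain
TCP connection to the central manager. The sketch below assumes the collector
listens on Condor's default port 9618; that port number is an assumption, not
something stated in this thread, so adjust it if the pool is configured
differently. A successful connect only shows that TCP gets through. Condor of
this era may send its updates over UDP, so it is not proof that everything is
open.

```python
import socket

COLLECTOR_HOST = "172.16.0.1"  # central manager address from this message
COLLECTOR_PORT = 9618          # assumption: default collector port


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if can_connect(COLLECTOR_HOST, COLLECTOR_PORT):
        print("collector reachable over TCP")
    else:
        print("collector NOT reachable over TCP")
```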


Another error I've found is that the file "condor_starter.pvm" is missing. 
Where can I find that?


Quoting Jaime Frey <jfrey@xxxxxxxxxxx>:

> On Mon, 1 Mar 2004 kge2@xxxxxxxx wrote:

> * Run condor_status and see if the hostname appears in the output. If the
> hostname doesn't appear, then Condor isn't aware of it as an execute node.
> 
> * On the machine, run ps and look for any process named condor_startd.
> condor_startd is the daemon that makes a machine an execute node.
> 
> * On the machine, run condor_config_val -master DAEMON_LIST and see if
> "STARTD" appears in the results. This will tell you if Condor is
> configured to run the condor_startd daemon. You can also look in the
> config file (which is where you'd change DAEMON_LIST if STARTD isn't
> listed).
> 
> * On the machine, look for a file StartLog in the Condor log directory. If
> it's present and has recent entries in it, the condor_startd is probably
> running.
> 
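Two of the checks above are easy to script. This is only a rough sketch: the
function names and the 10-minute freshness window are made up for
illustration, and it assumes `condor_config_val LOG` reports the log directory
on the machine being checked. Note that a recently written StartLog only
suggests the startd is running; a startd that keeps restarting and exiting
will also write fresh entries.

```python
import os
import subprocess
import time


def startd_configured(daemon_list):
    """True if STARTD is one of the names in a DAEMON_LIST value.

    Accepts comma- and/or whitespace-separated names in any letter case.
    """
    names = daemon_list.replace(",", " ").split()
    return "STARTD" in (name.upper() for name in names)


def log_recently_written(path, max_age_seconds=600, now=None):
    """True if path exists and was modified within max_age_seconds."""
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return False
    if now is None:
        now = time.time()
    return (now - mtime) <= max_age_seconds


def config_val(*args):
    """Run condor_config_val with args; return its output, or None on failure."""
    try:
        result = subprocess.run(["condor_config_val", *args],
                                capture_output=True, text=True)
    except FileNotFoundError:
        return None  # condor_config_val is not on PATH
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    daemons = config_val("-master", "DAEMON_LIST")
    log_dir = config_val("LOG")
    if daemons is None or log_dir is None:
        print("could not query condor_config_val")
    else:
        print("STARTD in DAEMON_LIST:", startd_configured(daemons))
        print("StartLog written recently:",
              log_recently_written(os.path.join(log_dir, "StartLog")))
```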
> As for what interface the Condor daemons are using, every Condor daemon
> writes something like the following to its log when it starts:
> 
> 12/31 16:58:38 ******************************************************
> 12/31 16:58:38 ** condor_master (CONDOR_MASTER) STARTING UP
> 12/31 16:58:38 ** $CondorVersion: 6.6.1 Dec 30 2003 RH9-BRANCH-PRE-RELEASE $
> 12/31 16:58:38 ** $CondorPlatform: I386-LINUX-RH9 $
> 12/31 16:58:38 ** PID = 7125
> 12/31 16:58:38 ******************************************************
> 12/31 16:58:38 Using config file: /some/path/name
> 12/31 16:58:38 Using local config files: /some/other/path/name
> 12/31 16:58:38 DaemonCore: Command Socket at <128.105.111.110:32873>
> 
> That last line tells you the ip:port the daemon is listening on. All
> outgoing connections will be made on the same network interface. If it
> reads 127.0.0.1, you're going to have problems. :-)
> 
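To scan a daemon log for that line without reading it by eye, something like
the following could work. It is a sketch that assumes the line has exactly the
format shown in the sample above (IPv4 address and port inside angle
brackets); the function names are hypothetical.

```python
import ipaddress
import re

# Matches lines such as:
#   12/31 16:58:38 DaemonCore: Command Socket at <128.105.111.110:32873>
SOCKET_LINE = re.compile(r"DaemonCore: Command Socket at <([0-9.]+):(\d+)>")


def command_sockets(log_text):
    """Return an (ip, port) pair for each Command Socket line in log_text."""
    return [(ip, int(port)) for ip, port in SOCKET_LINE.findall(log_text)]


def is_loopback(ip):
    """True if ip is a loopback address such as 127.0.0.1."""
    return ipaddress.ip_address(ip).is_loopback
```

The last pair returned by command_sockets() corresponds to the most recent
daemon start in the log; if is_loopback() is true for its address, the daemon
has bound to 127.0.0.1.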
> +------------------------------------+-------------------------------+
> |             Jaime Frey             |There are 10 types of people in|
> |         jfrey@xxxxxxxxxxx          |the world: Those who understand|
> |   http://www.cs.wisc.edu/~jfrey/   |  binary, and those who don't  |
> +------------------------------------+-------------------------------+
> Condor Support Information:
> http://www.cs.wisc.edu/condor/condor-support/
> To Unsubscribe, send mail to majordomo@xxxxxxxxxxx with
> unsubscribe condor-users <your_email_address>
> 
> 
