
Re: [Condor-users] Daemon problems



On Fri June 17 2005 7:02 am, Alexandre Badez wrote:
> Thanks Nick, but actually my node1 will not execute the negotiator
> (I don't know why). The Master's log file only says that the
> negotiator failed to execute and will be retried later...

Ah, OK.

Is there a NegotiatorLog created at all?  If so, look at it to see what it's 
complaining about; send a snippet along if you can't identify the problem.  
You probably want to turn on D_FULLDEBUG in the NEGOTIATOR_DEBUG macro.

If not, let's try a couple more things:

1. Turn on D_FULLDEBUG in the MASTER_DEBUG macro and restart the master.  Does 
this provide any more clues?

2. Run: condor_config_val NEGOTIATOR
Verify that the executable it points at is correct.

3. Run the negotiator directly: `condor_config_val NEGOTIATOR` -t -f 
It should print output something like this:

6/17 08:43:22 ******************************************************
6/17 08:43:22 ** condor_negotiator (CONDOR_NEGOTIATOR) STARTING UP
6/17 08:43:22 ** /afs/cs.wisc.edu/unsup/condor-production/condor-6.7.8-1/i386_rh9/sbin/condor_negotiator
6/17 08:43:22 ** $CondorVersion: 6.7.8 Jun  5 2005 $
6/17 08:43:22 ** $CondorPlatform: I386-LINUX_RH9 $
6/17 08:43:22 ** PID = 25541
6/17 08:43:22 ******************************************************
6/17 08:43:22 Using config file: /var/home/condor/condor_config
6/17 08:43:22 Using local config files: /unsup/condor/etc/condor_config.hosts /unsup/condor/etc/condor_config.global /unsup/condor/etc/condor_config.policy /unsup/condor/etc/condor_config.platform /unsup/condor/etc/condor_config.afs_sysname /unsup/condor/etc/hosts/chopin.local
6/17 08:43:22 DaemonCore: Command Socket at <128.105.121.21:47526>

You can hit CTRL-C to kill it once it starts.
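
If it helps, steps 2 and 3 can be wrapped in a small shell sketch that checks
the binary before you try to run it. The check_binary helper below is a
hypothetical name for illustration, not part of Condor; it only tests that the
path it is given exists and is executable.

```shell
#!/bin/sh
# Sketch only: check_binary is a hypothetical helper, not a Condor tool.
# It verifies that the path passed in is a regular, executable file.
check_binary() {
    path="$1"
    if [ -z "$path" ]; then
        # An empty value usually means the macro is not defined.
        echo "no path given (is the macro defined?)"
        return 2
    fi
    if [ ! -f "$path" ]; then
        echo "$path: no such file"
        return 1
    fi
    if [ ! -x "$path" ]; then
        echo "$path: not executable"
        return 1
    fi
    echo "$path: looks OK"
    return 0
}
```

With that defined, something like
check_binary "`condor_config_val NEGOTIATOR`"
covers step 2, and if it reports OK you can go on to run the negotiator by
hand as in step 3.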

> On the other hand, there are no problems on my other nodes. So I use my node2
> as central manager, and it seems to work great now. But I wonder why I
> can't execute the negotiator on my node1. Indeed, my nodes are pretty much
> exactly the same (same hardware, same OS, same configuration); the only
> difference is that on my node1 I share a folder by NFS with other nodes.
> Maybe a bug?

I'm not sure what you mean here, but NFS sharing shouldn't be a problem unless 
you're doing something really weird or have some type of setup problem.

> I'm searching for more information about this.

Let's see if any of the above helps.

-Nick

-- 
           <<< The answer is out there, Neo. >>>
 /`-_    Nicholas R. LeRoy               The Condor Project
{     }/ http://www.cs.wisc.edu/~nleroy  http://www.cs.wisc.edu/condor
 \    /  nleroy@xxxxxxxxxxx              The University of Wisconsin
 |_*_|   608-265-5761                    Department of Computer Sciences