
Re: [Condor-users] condor_master dying in 7.6.9



On 18/08/12 16:38, Brian Candler wrote:
> On Sat, Aug 18, 2012 at 04:26:51PM +0100, Roderick Johnstone wrote:
>> I'm seeing my condor_master die on startup in 7.6.9 on one of my
>> systems. I'm using the x86_64 rpms from the condor rhel6 repo on rhel6.3
> ...
>> A fragment of the MasterLog file is given below. I note the WARNING
>> message but don't see a problem with the name configuration for the
>> host that's giving this problem, though I might have overlooked
>> something. It does seem strange that it should think that localhost6
>> is related to an IPv4 address, though.
> 
> What does your /etc/hosts show for localhost and localhost6? i.e. output
> from
> 
>     grep -i localhost /etc/hosts

$ grep -i localhost /etc/hosts
127.0.0.1       localhost       localhost.localdomain   localhost4 localhost4.localdomain4
::1     localhost       localhost.localdomain   localhost6 localhost6.localdomain6
127.0.0.1       localhost.localdomain   localhost
::1     localhost6.localdomain6 localhost6

$ grep -i localhost6 /etc/hosts
::1     localhost       localhost.localdomain   localhost6 localhost6.localdomain6
::1     localhost6.localdomain6 localhost6
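
As a quick cross-check of the entries above, here is a minimal Python sketch that lists any name appearing under more than one address. The helper names `parse_hosts` and `ambiguous_names` are made up for illustration, and the hosts text is pasted in as a string rather than read from disk. It only inspects the file contents; it says nothing about how Condor itself resolves names.

```python
from collections import defaultdict

def parse_hosts(text):
    """Map each address in hosts-file text to its names, in file order."""
    names_by_addr = defaultdict(list)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        addr, *names = line.split()
        for name in names:
            if name not in names_by_addr[addr]:
                names_by_addr[addr].append(name)
    return dict(names_by_addr)

def ambiguous_names(names_by_addr):
    """Return the names that are listed under more than one address."""
    addrs_by_name = defaultdict(set)
    for addr, names in names_by_addr.items():
        for name in names:
            addrs_by_name[name].add(addr)
    return {n: sorted(a) for n, a in addrs_by_name.items() if len(a) > 1}

# The localhost lines from the grep output above.
HOSTS = """\
127.0.0.1       localhost       localhost.localdomain   localhost4 localhost4.localdomain4
::1     localhost       localhost.localdomain   localhost6 localhost6.localdomain6
127.0.0.1       localhost.localdomain   localhost
::1     localhost6.localdomain6 localhost6
"""

if __name__ == "__main__":
    for name, addrs in ambiguous_names(parse_hosts(HOSTS)).items():
        print(name, "->", ", ".join(addrs))
```

With these entries, localhost and localhost.localdomain are each listed under both 127.0.0.1 and ::1, while localhost6 appears only under ::1.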



> 
> And also for each of the interface addresses on your box,
> 
>     grep x.x.x.x /etc/hosts

For eth0 this shows x.x.x.x followed by the correct fully qualified
domain name, followed by the correct short hostname.

For eth0:1 there is no entry matching the IP address of that interface.

> 
> There appear to be two problems here:
> 
> 1. Whatever IP address it has found, condor has done a reverse lookup to find
>    "localhost6" and then a forward lookup found something else.
> 
>    7.6.9 is now paranoid about matching forward and reverse DNS, in case
>    you have configured access controls as *.yourdomain.com and a third
>    party decides to put foo.yourdomain.com in their reverse DNS.
> 
> 2. Condor crashes in this situation (clearly it should either
>    continue or terminate gracefully).
> 
> Running condor_master under strace (strace -f condor_master 2>strace.txt) or
> gdb (gdb condor_master // run // bt) may show some more info.

OK, thanks. I'll try this later this weekend when I get a moment to
reinstall the crashing rpm.

Thanks for such a prompt response.

Roderick
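
For reference, the forward/reverse matching described in point 1 above can be sketched in Python as follows. This is only an illustration of the general idea (reverse-resolve an address, forward-resolve the resulting name, and check that the original address comes back); it is not Condor's code, and the function names `system_reverse`, `system_forward` and `forward_confirmed` are made up for this example. The resolver functions are passed in as parameters so the check can be tried with fake data.

```python
import ipaddress
import socket

def _norm(addr):
    """Normalise an address string so equivalent spellings compare equal."""
    return ipaddress.ip_address(addr.split("%", 1)[0])  # drop any IPv6 scope id

def system_reverse(addr):
    """Reverse lookup: names the system resolver reports for an address."""
    name, aliases, _ = socket.gethostbyaddr(addr)
    return [name, *aliases]

def system_forward(name):
    """Forward lookup: addresses the system resolver reports for a name."""
    return sorted({info[4][0] for info in socket.getaddrinfo(name, None)})

def forward_confirmed(addr, reverse=system_reverse, forward=system_forward):
    """Return (ok, name, addrs): ok is True if the reverse-resolved
    name forward-resolves back to the original address."""
    name = reverse(addr)[0]
    addrs = forward(name)
    ok = _norm(addr) in {_norm(a) for a in addrs}
    return ok, name, addrs

if __name__ == "__main__":
    # Uses the real resolver, so the result depends on the local
    # /etc/hosts and DNS configuration.
    for address in ("127.0.0.1", "::1"):
        try:
            print(address, forward_confirmed(address))
        except OSError as exc:
            print(address, "lookup failed:", exc)
```

A mismatch of the kind reported in the warning would show up here as ok being False, with the name and the forward-resolved addresses returned for inspection.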