
[Condor-users] condor 6.6.2 suddenly started seg faulting



Hi,

I have condor 6.6.2 running as the central manager on a
SuSE 9.0 box, and it's been running great for the last
several months.

Now, today it suddenly stopped working: when I try to run
any condor command or start the daemons, I immediately get
a seg fault. When I run strace on the command, the last
file it opens before crashing is /etc/nsswitch.conf:

open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20^\1\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1469811, ...}) = 0
old_mmap(NULL, 1264932, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40031000
old_mmap(0x4015f000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x12e000) = 0x4015f000
old_mmap(0x40164000, 7460, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40164000
close(3)                                = 0
open("/lib/ld-linux.so.2", O_RDONLY)    = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20\f\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=112347, ...}) = 0
old_mmap(NULL, 101880, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40166000
old_mmap(0x4017e000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x17000) = 0x4017e000
close(3)                                = 0
munmap(0x40000000, 79417)               = 0
brk(0)                                  = 0x8355000
brk(0x8376000)                          = 0x8376000
brk(0)                                  = 0x8376000
open("/etc/nsswitch.conf", O_RDONLY)    = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=1291, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40000000
read(3, "#\n# /etc/nsswitch.conf\n#\n# An ex"..., 4096) = 1291
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x40000000, 4096)                = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

There's been some recent work on DNS and the like, but as far
as I can see everything on that side is working
smoothly. Nothing on the machine itself has been
updated, and nothing else seems broken.
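
Since the trace dies right after reading /etc/nsswitch.conf, one way
to test the name-service side independently of condor is to run the
same kind of lookups in a throwaway child process and see whether it
is also killed by a signal. A minimal sketch, assuming a Python 3
interpreter is available on the box; the function name probe and the
two lookups chosen are illustrative assumptions, not taken from condor:

```python
import subprocess
import sys

# Hypothetical probes: each one-liner forces a lookup that goes through
# the libc name-service switch configured in /etc/nsswitch.conf.
PROBES = {
    "passwd": "import pwd, os; pwd.getpwuid(os.getuid())",
    "hosts": "import socket; socket.gethostbyname(socket.gethostname())",
}

def probe(snippet):
    """Run snippet in a child interpreter.

    Returns the number of the signal that killed the child, or 0 if it
    exited on its own (even if the lookup itself failed with an error).
    """
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # On POSIX, a negative return code means "killed by that signal".
    return -result.returncode if result.returncode < 0 else 0

if __name__ == "__main__":
    for name, snippet in PROBES.items():
        sig = probe(snippet)
        print(name, "killed by signal %d" % sig if sig else "did not crash")
```

If one of the probes is killed by signal 11 (SIGSEGV), that would
point at the system's lookup libraries or their configuration rather
than at condor itself.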

Does anybody have any ideas? This is a production system, so I'd really
like to get it back up sooner rather than later.

Thanks in advance for any help!!

- Filip