
Re: [Condor-users] Condor Solaris Installation



On Wed, Mar 16, 2005 at 12:07:08PM -0500, Horsfield, Peter A wrote:
> Program received signal SIGSEGV, Segmentation fault.
> 0x2b31f8 in strlen ()
> (gdb) bt
> #0  0x2b31f8 in strlen ()
> #1  0x2acd80 in _strdup ()
> #2  0x6c82c in dirname ()
> #3  0x555cc in dirname ()
> #4  0x539a4 in dirname ()
> #5  0x534d4 in dirname ()
> #6  0x447b8 in exit ()
> (gdb)
> 
> The warning doesn't look very hopeful to me!

Yeah, that does look pretty bad. What about your NSS lookup scheme? Do you
use LDAP with SSL? You can find out in /etc/nsswitch.conf. 
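
A quick way to pull out the relevant lines, as a rough sketch: the function
name show_nss_sources is made up for illustration, and on older Solaris
systems "egrep" may be needed in place of "grep -E".

```shell
# Print the lookup sources for the passwd, group and hosts databases.
# The file path is a parameter so the function can be pointed at a copy;
# it defaults to the system file.
show_nss_sources() {
    conf="${1:-/etc/nsswitch.conf}"
    grep -E '^(passwd|group|hosts):' "$conf"
}

# Only run against the system file if it is actually there.
if [ -r /etc/nsswitch.conf ]; then
    show_nss_sources
fi
```

An "ldap" entry on the passwd or group line would indicate that user
lookups go through LDAP.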

> Additionally I tried truss'ing condor_master. There seemed to be an
> awful lot of messages written to /dev/null but I cannot get condor to
> write that data elsewhere. Setting MASTER_DEBUG=D_ALL didn't seem to
> produce any output either.

You could run the master in gdb again, but this time run it with "-f
-t" as command line flags. Start gdb with condor_master, then at the
(gdb) prompt type "r -f -t". But before running it, set a breakpoint on
exit(). If that breakpoint is hit before the segfault, then a backtrace
at that point might be useful. For reasons too nasty to mention, Condor
has a definition of exit() in its codebase and I want to see what is
being hit where. If you get some message about exit not being defined
yet, then set a breakpoint on main(), run it until you hit it, then set
another breakpoint on exit, and type 'continue'.
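
The steps above can also be put into a gdb command file. This is only a
sketch: the file name condor_master.gdb is arbitrary, it assumes gdb and
condor_master are both on your PATH, and it uses the break-on-main-first
variant so that it works whether or not exit is resolvable at startup.

```shell
# Write a gdb command file that mirrors the manual steps:
# stop at main, start the master with -f -t, then break on exit,
# continue, and print a backtrace wherever execution stops next
# (either the exit breakpoint or the segfault).
cat > condor_master.gdb <<'EOF'
break main
run -f -t
break exit
continue
bt
EOF

# Then, by hand:
#   gdb -x condor_master.gdb condor_master
```

gdb stays at its prompt after the command file finishes, so you can
inspect further from there.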

Also, do you have a "condor" user on the machine and is its uid
non-zero? Are you using a personal condor or running the daemons as
root? If you are using a personal condor, do you have the env variable
CONDOR_CONFIG set up, and if so what is it?
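
Something along these lines should answer both questions. It is a sketch
only: report_condor_config is a made-up helper name, and plain "id NAME"
is used because it is the most widely supported form.

```shell
# Show the uid/gid of the "condor" account, if it exists.
id condor 2>/dev/null || echo "no 'condor' user found"

# Report whether CONDOR_CONFIG is set in the current environment,
# and to what value.
report_condor_config() {
    if [ -n "${CONDOR_CONFIG:-}" ]; then
        echo "CONDOR_CONFIG=$CONDOR_CONFIG"
    else
        echo "CONDOR_CONFIG is not set"
    fi
}
report_condor_config
```

Run it as the same user that starts condor_master, since the environment
may differ between accounts.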

If you don't mind, can you make your global and local config files
available to me? (You can email them to me personally in *this* instance
if you don't want others to see them.)

Thank you.

-pete