
[Condor-users] Why would a native program run fine when executed directly, but fail with a seg fault when submitted through condor



This is a question that I posted on Stack Overflow: http://stackoverflow.com/q/5705626/370562

I have a third-party library that I'm attempting to incorporate into a simulation.  We have the static library (.a), along with all of its runtime dependencies (shared objects).  I've created a very simple application (in C) that is linked against the library.  All it does is call an initialization function that is part of the third-party library's API, and then exit.  When I run this directly from the command line, it works fine.  If I submit the executable to our Condor grid, it fails with a seg fault in strncpy (libc.so.6).  I've forced Condor to run the executable only on one particular machine, and if I run it directly on that same machine, it works fine.

I'm mostly a Java programmer, with a limited amount of native coding experience.  I'm familiar with tools such as nm, ldd, and catchsegv to the point where I can run them, but I don't really know where to start looking for the issue.

I've run ldd on the executable directly on the executing machine, and also via a script submitted through Condor along with my executable.  ldd reports the same files in both cases.

I don't understand why the program works when I run it directly but fails when it is run by Condor.  The process that ultimately executes the program, condor_startd, starts as root and changes its effective uid to that of the submitter.  Perhaps this has something to do with it?