
Re: [Condor-users] Why would a native program run fine when executed directly, but fail with a seg fault when submitted through condor



Some suggestions:

- an environment issue: print out your environment when running directly
and when running through condor (you can "condor_run set"), then compare
the two and see if anything stands out.

- a memory availability issue: the job may be running out of memory in
condor, because the limit is enforced via ulimit in the job wrapper but
is not enforced when the program is run directly.
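
Both checks can also be done from inside the test program itself, which
makes sure you are looking at exactly what the failing process sees. The
following is a minimal sketch, not anything that ships with condor: the
names dump_environment, format_limit and dump_limits are made up for
illustration, and the set of limits printed is only a guess at the ones
that matter for this library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

extern char **environ;

/* qsort comparator for an array of C strings. */
static int compare_strings(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Write every NAME=value pair to out, sorted so that two dumps diff
   cleanly. Returns the number of variables written, or -1 if the
   temporary array could not be allocated. */
int dump_environment(FILE *out)
{
    size_t count = 0;
    while (environ != NULL && environ[count] != NULL)
        count++;
    if (count == 0)
        return 0;

    char **sorted = malloc(count * sizeof *sorted);
    if (sorted == NULL)
        return -1;
    memcpy(sorted, environ, count * sizeof *sorted);
    qsort(sorted, count, sizeof *sorted, compare_strings);

    for (size_t i = 0; i < count; i++)
        fprintf(out, "%s\n", sorted[i]);
    free(sorted);
    return (int)count;
}

/* Render one limit value as text; RLIM_INFINITY becomes "unlimited". */
void format_limit(rlim_t value, char *buf, size_t size)
{
    if (value == RLIM_INFINITY)
        snprintf(buf, size, "unlimited");
    else
        snprintf(buf, size, "%llu", (unsigned long long)value);
}

/* Write the soft and hard values of a few memory-related limits to out.
   Which limits are relevant here is an assumption. */
void dump_limits(FILE *out)
{
    static const struct { int resource; const char *name; } limits[] = {
        { RLIMIT_AS,    "RLIMIT_AS"    },
        { RLIMIT_DATA,  "RLIMIT_DATA"  },
        { RLIMIT_STACK, "RLIMIT_STACK" },
        { RLIMIT_CORE,  "RLIMIT_CORE"  },
    };

    for (size_t i = 0; i < sizeof limits / sizeof limits[0]; i++) {
        struct rlimit rl;
        char soft[32], hard[32];

        if (getrlimit(limits[i].resource, &rl) != 0) {
            perror(limits[i].name);
            continue;
        }
        format_limit(rl.rlim_cur, soft, sizeof soft);
        format_limit(rl.rlim_max, hard, sizeof hard);
        fprintf(out, "%s soft=%s hard=%s\n", limits[i].name, soft, hard);
    }
}
```

Call dump_environment(stdout) and dump_limits(stdout) at the top of
main(), before the library's initialization function, then run the
program once directly and once through condor and diff the two outputs.
A variable that is present in one dump and missing in the other, or a
limit that is "unlimited" in one and finite in the other, is a good
place to start looking.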

- Erik



On Mon, Apr 18, 2011 at 12:16 PM, Shrader, Joshua H.
<Joshua.Shrader@xxxxxxxxxx> wrote:
> This is a question that I posted at stackoverflow...
>  http://stackoverflow.com/q/5705626/370562
>
> I have a third party library that I'm attempting to incorporate into a
> simulation.  We have the static library (.a), along with all of its runtime
> dependencies (shared objects).  I've created a very simple application (in
> C) that is linked against the library.  All it does is call an
> initialization function that is part of the third party library's API, and
> exits.  When I run this directly from the command line, it works fine.  If I
> submit the executable to our Condor grid, it fails with a seg fault on
> strncpy (libc.so.6).  I've forced condor to only run the executable on a
> particular machine, and if I run it directly on that machine, it works fine.
>
> I'm mostly a Java programmer, with a limited amount of native coding
> experience.  I'm familiar with tools such as nm, ldd, and catchsegv to the
> point where I can run them, but I don't really know where to start looking
> for the issue.
>
> I've run ldd directly on the executing machine, and via a script submitted
> through condor, along with my executable.  ldd reports the same files in
> both cases.
>
> I don't understand how running it directly would work, but it would fail
> being run by condor.  The process that ultimately executes the program,
> condor_startd, is a process that starts as root, and changes its effective
> uid to the submitter.  Perhaps this has something to do with it?