[Condor-users] schedd keeps dying



Hi Condor users and experts,

I'm seeing my condor_schedd die repeatedly with the stack trace below.
I've put the core file up at:
http://www.mwt2.org/~sarah/core
The installation was previously stable at ~1000 cores, but became
unstable when increased to 4000 cores.

--Sarah

Stack dump for process 24238 at timestamp 1322852851 (22 frames)
condor_schedd(dprintf_dump_stack+0x56)[0x66f2f6]
condor_schedd(_Z18linux_sig_coredumpi+0x4d)[0x59be2d]
/lib64/libpthread.so.0[0x339180eb10]
/lib64/libc.so.6(abort+0x28f)[0x3391031e8f]
/usr/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x114)[0x3095abecb4]
/usr/lib64/libstdc++.so.6[0x3095abcdb6]
/usr/lib64/libstdc++.so.6[0x3095abcde3]
/usr/lib64/libstdc++.so.6[0x3095abceca]
/usr/lib64/libstdc++.so.6(_Znwm+0x79)[0x3095abd1d9]
condor_schedd(_ZN13_condorOutMsgC1Ev+0x1b)[0x5ef02b]
condor_schedd(_ZN8SafeSockC1Ev+0x36)[0x5e3186]
condor_schedd(_ZN10DaemonCore14Create_ProcessEPKcRK7ArgList10priv_stateiiPK3EnvS1_P10FamilyInfoPP6StreamPiSE_iP10__sigset_tiPmSE_S1_P8MyString+0x18a)[0x59372a]
condor_schedd(_ZN9Scheduler18spawnJobHandlerRawEP10shadow_recPKcRK7ArgListPK3EnvS3_bbb+0x225)[0x5275b5]
condor_schedd(_ZN9Scheduler11spawnShadowEP10shadow_rec+0x2e4)[0x5392f4]
condor_schedd(_ZN9Scheduler15spawnJobHandlerEiiP10shadow_rec+0xa0)[0x5398c0]
condor_schedd(_Z26aboutToSpawnJobHandlerDoneiiPvi+0xe2)[0x539ae2]
condor_schedd(_ZN9Scheduler15StartJobHandlerEv+0x13e)[0x53a15e]
condor_schedd(_ZN12TimerManager7TimeoutEv+0x155)[0x5a18f5]
condor_schedd(_ZN10DaemonCore6DriverEv+0x248)[0x58ed78]
condor_schedd(main+0xe47)[0x59e5a7]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x339101d994]
condor_schedd(__gxx_personality_v0+0x411)[0x4fa779]