RE: [Condor-users] condor_collector crashing with Signal 11



> > > > There is only one suspect, recurring message in the CollectorLog on
> > > > our master machine:
> > > >
> > > > 3/14 11:06:50 DC_AUTHENTICATE: attempt to open invalid session 
> > > > ttc-condorsrv:7819:1110815961:10, failing.
> > > >
> > > > Not sure if this is something that can be fatal or not.
> > > > Anyone out there know?
> > >
> > > More information: it crashed even after I rolled back to 6.7.3 so I'm
> > > suspecting it might be hardware at this point.
> > 
> > Do you have a core dump from a crash?  What O/S is this?
> > 
> > You can turn on core dumps by setting CREATE_CORE_FILES in your 
> > condor_config.
> 
> The site master is Linux (Red Hat 9), running 6.7.5, but we still have a
> lot of 6.7.3 startds/schedds in our pool. CREATE_CORE_FILES was unset; does
> that mean it defaults to true? There were no core files created that I
> could find (not in root's home dir or under any directory where Condor is
> installed on this machine, /home/condor).
> 
> I'm at the conference if you want to meet to talk about this...

I forgot to mention that we replaced the hardware and it's still
segfaulting.

- Ian