Re: [Condor-users] Condor 6.9.2 hung schedd



On Wed, Jun 13, 2007 at 08:59:59AM -0700, Stuart Anderson wrote:
> > You could use condor_qedit to change the value of UserLog for the 
> > problematic jobs and then remove them.

I tried that, but it hung the same way...

> You can also use lsof on the hung schedd processes to find the offending
> file, move it to the side and restart schedd. That has worked for us in
> the past.

That did the trick (it's a bit time-consuming if you have to remove dozens
of files and are not allowed to fiddle with the directory they're in).
Thanks, Stuart!
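
For the archive, here is a minimal sketch of the lsof step for anyone who
has to repeat it across many files. It assumes Python 3 and lsof are
available on the submit host; the function names are made up for this
example and are not part of Condor. It only lists the files a given
process holds open (using lsof's -F field output, where lines starting
with "n" carry the file name). Deciding which file is the offending one,
moving it aside, and restarting the schedd remain manual steps.

```python
import subprocess


def parse_lsof_names(field_output):
    """Extract file names from `lsof -Fn` output, keeping first-seen order."""
    names = []
    for line in field_output.splitlines():
        # In -F output every line begins with a one-character field id;
        # "n" marks the file name field, "p" the PID, "f" the descriptor.
        if line.startswith("n") and len(line) > 1:
            name = line[1:]
            if name not in names:
                names.append(name)
    return names


def open_files(pid):
    """Run lsof for one PID (hypothetical helper) and return its open files."""
    result = subprocess.run(
        ["lsof", "-p", str(pid), "-Fn"],
        capture_output=True,
        text=True,
        check=False,  # lsof may exit non-zero if the PID has vanished
    )
    return parse_lsof_names(result.stdout)
```

Calling open_files() with the PID of a hung schedd process (taken from ps,
for example) returns the list of paths to inspect.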

Question to the Condor developers: where is the status of submitted jobs kept
across a restart of condor_schedd? It might be easier to make changes there.
And why doesn't 'condor_restart -sub schedd' work in this case?

Steffen