
Re: [Condor-users] Condor 6.9.2 hung schedd



On Wed, Jun 13, 2007 at 11:36:44AM -0500, Dan Bradley wrote:
> 
> 
> Steffen Grunewald wrote:
> 
> >Question to Condor developers: where's the status of submitted jobs kept
> >over a restart of condor_schedd? It might be easier to make changes there...
>
> $(SPOOL)/job_queue.log
> 
> It is fairly easy to understand the format and to make manual changes, 
> but be careful!
> 
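Before hand-editing that file it seems prudent to keep a copy. A minimal sketch, assuming `condor_config_val` is on the PATH and that the edit is done while the schedd is not running (my assumption, not something stated above); the helper names `spool_dir` and `backup_job_queue` are made up for illustration:

```python
import shutil
import subprocess
import time
from pathlib import Path


def spool_dir() -> Path:
    """Ask the local Condor configuration for the SPOOL directory."""
    out = subprocess.run(
        ["condor_config_val", "SPOOL"],
        check=True, capture_output=True, text=True,
    ).stdout
    return Path(out.strip())


def backup_job_queue(spool: Path) -> Path:
    """Copy job_queue.log to a timestamped sibling file and return its path."""
    src = spool / "job_queue.log"
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = spool / f"job_queue.log.bak-{stamp}"
    shutil.copy2(src, dst)  # copy2 also preserves the modification time
    return dst
```

Typical use would be `backup_job_queue(spool_dir())` right before opening the file in an editor.
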
> >And why doesn't 'condor_restart -sub schedd' work in this case?
>
> Hmm.  It worked for me when I tried it, but I'm running a pre-release of 
> 6.9.3.  The usual problem people have is that their security 
> configuration doesn't allow condor_restart to operate from the machine 
> where they are running it, but the command-line tool does not know 
> whether the operation was rejected or not, so there is no visible 
> complaint to the user.  If you look in the schedd log, you will see a 
> message indicating that it rejected the command.
> 
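
To check for the rejection Dan describes, one can scan the schedd log for the relevant message. A small sketch: the log path comes from the caller (for example from `condor_config_val SCHEDD_LOG`), and the default search string "PERMISSION DENIED" is my assumption about the wording, so adjust it to whatever the log actually contains. The function name `find_rejections` is hypothetical.

```python
def find_rejections(log_path, needle="PERMISSION DENIED"):
    """Return every line of the log that contains `needle`.

    `needle` is an assumed marker for a rejected command; pass a
    different string if the schedd log words the rejection differently.
    """
    hits = []
    # errors="replace" keeps the scan going if the log has odd bytes in it
    with open(log_path, errors="replace") as fh:
        for line in fh:
            if needle in line:
                hits.append(line.rstrip("\n"))
    return hits
```

An empty result right after a failed `condor_restart -sub schedd` would suggest that the command never reached the schedd, or that the message is worded differently.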

One open question is whether the schedd will honor a restart request while
it is blocked in a system call waiting for a file lock on a user log file.
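
One way to test for that situation from outside is to probe the user log with a non-blocking lock request. A rough sketch, assuming the lock in question is a POSIX `fcntl` record lock (I have not confirmed which locking call the schedd uses) and that the probe runs in a different process from the lock holder, since POSIX locks never conflict within one process. The name `lock_is_free` is made up.

```python
import errno
import fcntl


def lock_is_free(path: str) -> bool:
    """Try to take an exclusive lock without blocking.

    Returns True if the lock was obtained (it is released again at once),
    False if another process currently holds a conflicting lock.
    """
    # Append mode gives the write access an exclusive lock needs,
    # without truncating the file.
    with open(path, "a") as fh:
        try:
            fcntl.lockf(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return False  # somebody else holds the lock
            raise  # anything else is a real error
        fcntl.lockf(fh, fcntl.LOCK_UN)
        return True
```

If this returns False for a job's user log while the schedd looks hung, a stuck lock is at least a plausible cause. On NFS the result also depends on the lock daemon, so treat it as a hint only.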

-- 
Stuart Anderson  anderson@xxxxxxxxxxxxxxxx
http://www.ligo.caltech.edu/~anderson