[Condor-users] startd hangs when using job hooks



I am trying to implement a set of fetch and prepare job hooks. However,
when testing the hooks, condor_startd sometimes hangs. When startd hangs,
it stops responding to requests and to Condor shutdown commands. Only a
process-level kill ends the process.

The host running the hooks is a Windows Vista host running Condor 7.4.1. 
The prepare hook does take some time to run (on the order of minutes).
However, startd does not always hang during the prepare hook. Sometimes
startd hangs after the job begins executing, and sometimes it doesn't
hang at all.

Has anyone else seen similar behavior? Is there a way to work around
the problem? Apparently, there was a similar problem in 7.3.2 and earlier
where a very simple fetch hook would cause startd to hang. I haven't
figured out which portion of the hook triggers this behavior; it is very
intermittent.
 
Thanks,
Michael Moore