Re: [Condor-users] startd hangs when using job hooks



Hi Michael,

> In my case the issue mysteriously resolved itself by changing the way
> STDIN was read in the fetch script. I had been doing, in Perl, a join()
> on STDIN. When I switched to using a while(<STDIN>) and appending the
> input to a temporary string the issue went away. Unfortunately, I was
> not able to create a simple case where I could isolate what was causing
> the issue. In testing just a join() versus a while() in a fetch script
> it didn't exhibit the startd hang in either case.

I'm going to spend some time this afternoon with 7.4.1 trying to
isolate this. I'm reading STDIN with:

parse_condor_slot_information_and_populate_global_vars(<STDIN>);

Which is essentially the same as:

my @array = <STDIN>;
my_function(@array);

So I'm pulling all STDIN into an array first. I'll try modifying my
function so it reads STDIN in a while loop instead. Thanks for the
tip.
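
For reference, here is a rough sketch of what I mean by the while-loop
version. The name read_slot_ad is made up for illustration (it is not my
real function), and I'm assuming the slot ad arrives on STDIN as plain
"Attr = Value" lines. It takes a filehandle, so it can be tried against
an in-memory string without a startd:

```perl
use strict;
use warnings;

# Hypothetical helper: read "Attr = Value" lines from a filehandle one
# line at a time (scalar context) instead of slurping everything into a
# list first, and return the pairs as a hash reference.
sub read_slot_ad {
    my ($fh) = @_;
    my %ad;
    while (my $line = <$fh>) {
        chomp $line;
        # Skip anything that does not look like "Attr = Value".
        next unless $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/;
        $ad{$1} = $2;
    }
    return \%ad;
}

# In the hook itself this would be called as:
#   my $slot_ad = read_slot_ad(\*STDIN);
```

I have no evidence yet that this changes the hang; it only swaps the
list-context read for a line-by-line one, which is the difference you
described.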

What's weird is that, looking at my hook file's log output, I can see
hooks trying to hand off work to Condor. But only 3 out of 8 of them
try, and Condor never seems to receive the work. I'm just printing the
ClassAd to STDOUT. How are you returning it?

> As an additional note, I was seeing the exact same errors as the
> previous bug hanging startd with just the simple 'exit 0' fetch hook.

- Ian