
[HTCondor-users] Exit_hook receiving empty job ClassAd

Hi all,

We have been having trouble with our JOB_EXIT_HOOKS, in both HTCondor 7.8 and HTCondor 8.0. Some invocations (and, strangely, the number is increasing over time) don't receive any job ClassAd at all. At first we thought it could be a timeout issue (we have had our share of those as well), but that doesn't seem to be the case, since the hook script continues executing. Just in case, we have set both KILLING_TIMEOUT and xxxxx_HOOK_JOB_EXIT_TIMEOUT to 300 seconds, which should be more than enough.

The first thing our hook script does is dump the whole ClassAd from standard input to a file (for debugging purposes), and it is creating empty files:


TMPFILE=`mktemp /tmp/condorlog.XXXXXX`
cat > $TMPFILE

The script keeps going from there (reading the stored ClassAd and processing it). We can see that the script tries to do its job, but it complains about not having any data to work on. That's why we have ruled out a timeout.
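As a possible next debugging step, the dump could also record how many bytes actually arrived on standard input, together with a timestamp, so that empty invocations can be correlated with the daemon logs. The following is only a minimal sketch: the function name dump_classad, the optional directory argument, and the condorlog.index file are hypothetical names chosen for illustration, not anything provided by HTCondor.

```shell
#!/bin/sh
# Hypothetical helper (not part of HTCondor): save stdin to a temp file
# and append one line per invocation to an index file for later review.
dump_classad() {
    logdir=${1:-/tmp}   # assumed log location; defaults to /tmp as in the original snippet
    tmpfile=$(mktemp "$logdir/condorlog.XXXXXX") || return 1

    # Same dump as before: copy everything on stdin into the temp file.
    cat > "$tmpfile"

    # Count the bytes received; 0 means the hook got no ClassAd at all.
    bytes=$(wc -c < "$tmpfile" | tr -d ' ')

    # Record when it happened, which process, how much data, and where it went.
    printf '%s pid=%s bytes=%s file=%s\n' \
        "$(date '+%Y-%m-%dT%H:%M:%S')" "$$" "$bytes" "$tmpfile" \
        >> "$logdir/condorlog.index"

    # Print the temp file path so the caller can keep processing it.
    printf '%s\n' "$tmpfile"
}
```

The hook would then start with TMPFILE=$(dump_classad) in place of the two lines above, and lines with bytes=0 in the index file give the timestamps to look for in the logs.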

I found a similar report on the list from four years ago [1], but it doesn't seem to have been resolved. Is there anything I could do to further debug this issue?



[1]: https://lists.cs.wisc.edu/archive/htcondor-users/2009-July/msg00165.shtml
Joan Josep Piles Contreras -  Analista de sistemas
I3A - Instituto de Investigación en Ingeniería de Aragón
Tel: 876 55 51 47 (ext. 845147)
http://i3a.unizar.es -- jpiles@xxxxxxxxx