
Re: [HTCondor-users] Job disconnected



On Aug 23, 2017, at 9:16 AM, Hervé Lemaitre <herve.lemaitre@xxxxxxxxx> wrote:

So, I installed condor in another place too, and I have the same issue with the submit and execute node on Ubuntu.
When I run:
sudo journalctl -u condor
I get:
systemd[1]: condor.service: Watchdog timeout (limit 20min)!

I set MASTER_DEBUG = D_ALL
and I get that every 20 minutes:

Are you able to install debug symbols for the HTCondor binaries? That would be very helpful in debugging these crashes.
One way to do that is to download the tarball of unstripped binaries for Ubuntu 14 from our website. If you have any experience with gdb, then you can load a core file there (or start condor_master from within gdb) and get more precise information about why it's crashing.
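
As a rough sketch of the gdb steps mentioned above, something like the following may help. The function names and all paths in the usage comments are placeholders, not actual install locations; substitute the unstripped condor_master binary and wherever the core file landed on your machine. It also assumes gdb is installed.

```shell
#!/bin/sh
# Sketch only: helper names and the paths in the usage comments are
# hypothetical. Point them at the unstripped condor_master binary and
# the core file produced by the crash.

# Print a backtrace of every thread recorded in a core file.
#   $1 = path to the (unstripped) binary
#   $2 = path to the core file
core_backtrace() {
    gdb -batch -ex "thread apply all bt" "$1" "$2"
}

# Start the master under gdb instead of loading a core file.
# -f asks condor_master to stay in the foreground so gdb keeps control.
#   $1 = path to the (unstripped) binary
master_under_gdb() {
    gdb --args "$1" -f
}

# Example usage (placeholder paths):
#   core_backtrace /path/to/unstripped/condor_master /path/to/core
#   master_under_gdb /path/to/unstripped/condor_master
#   (then type "run" at the gdb prompt, and "bt" once it stops)
```

The backtrace printed by the first helper is usually the most useful thing to send back to the list.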

Thanks and regards,
Jaime Frey
UW-Madison HTCondor Project