
Re: [HTCondor-users] out-of-memory event?



On 10/26/2017 08:10 AM, Michael Di Domenico wrote:
  the jobs were failing on only a few
specific hosts and at exactly the same time everyday.  turns out there
is a cronjob on those machines that does 'systemctl restart
gdm.service'

it's not clear exactly why restarting gdm kills off the jobs,


Who is starting Condor on these machines? If Condor was started from a shell, I could understand this error. If systemd somehow thinks it owns the Condor cgroups and is destroying the active cgroups out from under Condor, that would explain this error as well.
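
One way to check the first question is to look at the cgroup membership of the running condor_master. The sketch below is only an illustration: it parses the contents of /proc/<pid>/cgroup and guesses whether the process lives in a systemd service unit or in a login session. The function names and the "system.slice" / "user.slice" heuristics are assumptions based on default systemd layouts, not anything HTCondor itself provides.

```python
import sys


def parse_cgroup(text):
    """Split /proc/<pid>/cgroup content into (hierarchy_id, controllers, path) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line has the form hierarchy-ID:controller-list:cgroup-path
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append((hierarchy_id, controllers, path))
    return entries


def classify_start(text):
    """Heuristic guess (assumes default systemd slice names) at how a process was started."""
    paths = [path for _, _, path in parse_cgroup(text)]
    # Processes launched from a login shell normally sit under user.slice.
    if any(p.startswith("/user.slice") for p in paths):
        return "login-session"
    # Processes launched by a unit file normally sit under system.slice/<name>.service.
    if any("system.slice" in p and p.endswith(".service") for p in paths):
        return "systemd-service"
    return "unknown"


if __name__ == "__main__":
    # Usage: python check_cgroup.py <pid of condor_master>
    if len(sys.argv) == 2:
        with open("/proc/%s/cgroup" % sys.argv[1]) as handle:
            print(classify_start(handle.read()))
```

A result of "login-session" would suggest the daemons are tied to a user session, which a display manager restart could plausibly tear down; "systemd-service" would point more toward the cgroup ownership question.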

-greg