Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [HTCondor-users] out-of-memory event?

Date: Thu, 26 Oct 2017 15:33:26 -0500
From: Greg Thain <gthain@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] out-of-memory event?

On 10/26/2017 08:10 AM, Michael Di Domenico wrote:

> the jobs were failing on only a few
> specific hosts and at exactly the same time everyday.  turns out there
> is a cronjob on those machines that does 'systemctl restart
> gdm.service'
>
> it's not clear exactly why restarting gdm kills off the jobs,

Who is starting Condor on these machines? If condor was started from a shell, I could understand this error. If somehow, systemd thinks it is the owner of the condor cgroups, and is destroying the active cgroups out from under condor, that would explain this error as well.


-greg
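
One way to tell the two cases in the message apart is to look at the cgroup that the condor_master process lives in. The following is a minimal sketch, not an HTCondor tool: the function names are made up for illustration, and it assumes a systemd host where services typically land under system.slice and processes started from a login session (such as a shell) typically land under user.slice. It parses the contents of /proc/<pid>/cgroup and reports which of the two the path points to.

```python
from pathlib import Path


def parse_cgroup_text(text):
    """Parse the contents of /proc/<pid>/cgroup.

    Each line has the form 'hierarchy-id:controllers:path'.
    Returns a dict mapping the controllers field to the cgroup path.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The path may itself contain ':', so split at most twice.
        _hierarchy_id, controllers, path = line.split(":", 2)
        entries[controllers] = path
    return entries


def classify(entries):
    """Guess how the process was started from its systemd cgroup path.

    Assumption: 'name=systemd' is the systemd hierarchy on cgroup v1,
    and the empty controllers field is the unified hierarchy on cgroup v2.
    """
    path = entries.get("name=systemd")
    if path is None:
        path = entries.get("", "")
    if path.startswith("/system.slice/") and ".service" in path:
        return "systemd service"
    if path.startswith("/user.slice/"):
        return "login session"
    return "unknown"


def describe_pid(pid):
    """Read /proc/<pid>/cgroup for a live process and classify it."""
    text = Path(f"/proc/{pid}/cgroup").read_text()
    return classify(parse_cgroup_text(text))
```

Calling describe_pid() with the PID of condor_master on an affected host would return "login session" if condor was started from a shell, which is the first case described above. A result of "systemd service" does not by itself rule out the second case; it only shows that the daemon was started as a unit.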

