
Re: [HTCondor-users] systemd interfering with Condor job cgroups



FYI, in addition to the systemd setting Delegate=yes, which is now the default out-of-the-box configuration for 8.6.12, it was also necessary to revert BASE_CGROUP to its default setting to further reduce the number of user processes that escape their assigned Condor dynamic slot cpu cgroup. After making this change, the number of escaped processes in a medium-sized pool with O(10k) slots has gone from O(100) after a full cluster restart to 0 observed in the first 48 hours of running.

Note that if you change BASE_CGROUP, you also need to double-check that any other cgroup settings you have in place, e.g., for memory limits, match it.
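
For anyone who wants to check their own nodes for escaped processes, here is a rough Python sketch of one way to do it. It is not the method used to get the numbers above. It assumes cgroup v1 and parses the text of /proc/<pid>/cgroup, where each line has the form "hierarchy-ID:controller-list:path". The function names and the slot_marker argument are hypothetical, and what the slot cgroup path actually looks like depends on your BASE_CGROUP and systemd setup, so adjust the marker to match what you see on a healthy job.

```python
from typing import Dict


def parse_proc_cgroup(text: str) -> Dict[str, str]:
    """Map each cgroup v1 controller name to its cgroup path.

    `text` is the content of /proc/<pid>/cgroup.
    """
    result: Dict[str, str] = {}
    for line in text.splitlines():
        # Split into at most 3 fields so that a path containing ':' survives.
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        _hierarchy_id, controllers, path = parts
        # Co-mounted controllers share one line, e.g. "cpu,cpuacct".
        for name in controllers.split(","):
            result[name] = path
    return result


def has_escaped(text: str, slot_marker: str, controller: str = "cpu") -> bool:
    """Return True if the process is NOT in a cgroup containing slot_marker.

    slot_marker is an assumption: a substring you expect in the path of a
    process that is correctly confined to its dynamic slot cgroup.
    """
    path = parse_proc_cgroup(text).get(controller)
    return path is None or slot_marker not in path
```

To use it on a live system, read /proc/<pid>/cgroup for each job process and pass the text to has_escaped() together with a marker taken from a job you know is confined correctly.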

Thanks.

> On Jul 1, 2018, at 9:56 AM, Thomas Hartmann <thomas.hartmann@xxxxxxx> wrote:
> 
> Hi Stuart,
> 
> ah, good to know. I can confirm that systemd-cgtop shows the correct
> hierarchy, i.e., the kernel's view.
> Consistency with systemd-cgls would indeed be nice... ;)
> 
> Cheers and thanks,
>  Thomas
> 
> On 2018-06-30 19:34, Stuart Anderson wrote:
>> 
>> Thomas, Greg,
>> 	With Delegate=yes on an SL7.5 system I also see systemd-cgls apparently showing all of the condor dynamic slot processes as if they were running directly in the condor.service group. However, systemd-cgtop shows the expected hierarchy.
>> 
> 

--
Stuart Anderson  anderson@xxxxxxxxxxxxxxxx
http://www.ligo.caltech.edu/~anderson