Re: [HTCondor-users] Use of cgroups



Hi Peter,

On 28/08/2023 03:56, Peter Ellevseth wrote:
> I am struggling to understand how the cgroup mechanism affects my 
> jobs. I have added a fresh node to our cluster and started a lot of 
> jobs on it, but all of a sudden it starts killing my jobs. I have 
> traced it back to the OOM killer. However, the execute machine has 
> 250GB of memory and my jobs are not using anywhere close to that. 

This looks like the cgroup v2 issue discussed in the thread "Memory 
accounting issue with cgroups"[1]. It was fixed in HTCondor 10.6.
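
If it helps to check whether a node matches that description, here is a
rough Python sketch. It rests on two assumptions that are mine and not
from the linked thread: that a `cgroup.controllers` file at the cgroup
root indicates cgroup v2, and that any version below 10.6 lacks the fix
(it ignores possible backports to other release series). The function
names are hypothetical, and the version string is passed in by hand,
for example as read from `condor_version`.

```python
from pathlib import Path

# Release named in this reply as containing the fix.
FIXED_IN = (10, 6)


def uses_cgroup_v2(root="/sys/fs/cgroup"):
    """Return True if root looks like a cgroup v2 (unified) hierarchy."""
    # The cgroup.controllers file sits at the root only on cgroup v2.
    return (Path(root) / "cgroup.controllers").exists()


def parse_version(text):
    """Turn a dotted version such as '10.4.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def may_be_affected(version_text, root="/sys/fs/cgroup"):
    """Heuristic only: cgroup v2 in use and version older than FIXED_IN."""
    return uses_cgroup_v2(root) and parse_version(version_text) < FIXED_IN


if __name__ == "__main__":
    # Replace the version string with the one installed on the node.
    print("cgroup v2 in use:", uses_cgroup_v2())
    print("possibly affected:", may_be_affected("10.4.2"))
```

A True result only means the node fits the pattern; it does not confirm
that the OOM kills come from this particular accounting issue.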

Cheers
Marco

[1] https://www-auth.cs.wisc.edu/lists/htcondor-users/2023-May/msg00080.shtml