
[HTCondor-users] Memory accounting issue with cgroups



The following issue started occurring with one of the 10.x releases (I'm not certain which, but it is still present in 10.4.3), installed from .debs on nodes running Ubuntu 22.04.

My config has long had "CGROUP_MEMORY_LIMIT_POLICY = hard" and "use POLICY : Hold_If_Memory_Exceeded", and jobs were correctly put on hold when they exceeded their request_memory.
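
For reference, the relevant lines in the node's configuration are:

    # enforce request_memory through the job's cgroup rather than only monitoring it
    CGROUP_MEMORY_LIMIT_POLICY = hard
    # policy metaknob: put jobs on hold when their memory usage exceeds request_memory
    use POLICY : Hold_If_Memory_Exceeded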

Now, with the same config and the same jobs, they eventually all go on hold with "memory usage exceeded request_memory", while their actual consumption (USS, PSS and RSS as reported by smem) never exceeds request_memory.

Their job log shows image size updates every 5 minutes, with the reported RSS steadily increasing by about 1 GB per 5 minutes. Once this exceeds request_memory, they (correctly) go on hold - except that their actual RSS never went beyond 2 GB. When I remove the 'use POLICY' line, the jobs keep running and the reported RSS keeps growing without bound.

Looking in the cgroup of the job's (dynamic) slot, it seems that Condor takes 'memory.current' to be the job's RSS. This would be correct if the job were under (severe) memory pressure, but (and this seems to be the crux of the issue) both 'memory.high' and 'memory.max' are set to "max" (and the machine has loads of memory), so memory.current presumably also counts page cache that the kernel has no reason to reclaim. The Condor docs suggest that memory.high and memory.max should be set to 90% and 100% of request_memory, respectively.
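
For illustration, this is what I look at inside the slot's cgroup (the slot directory name below is a placeholder, it differs per dynamic slot; on this node the HTCondor cgroup tree lives under /sys/fs/cgroup/htcondor):

    cd /sys/fs/cgroup/htcondor/<slot-cgroup>
    cat memory.current    # the value Condor appears to report as the job's RSS
    cat memory.high       # reads "max", though the docs suggest ~90% of request_memory
    cat memory.max        # reads "max", though the docs suggest 100% of request_memory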

In fact, when I "cat memory.current | sudo tee memory.high", then memory.current and the RSS reported by Condor stay at that same level throughout, which presumably is precisely how this was supposed to work. (Very elegant mechanism!)
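
Concretely, that manual experiment is just this, run inside the same slot cgroup directory as above:

    # clamp memory.high to the current usage, so the kernel applies reclaim pressure
    # (dropping page cache first) whenever the cgroup grows beyond that level
    cat memory.current | sudo tee memory.high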

Not sure where to look for diagnostics, but I see one ominous message in the slot's StarterLog: "Error while locating memcg controller for starter: 50014 Cgroup not initialized". This is the tail of that log:

 05/18/23 09:50:58 (pid:1120439) Starting a VANILLA universe job with ID: 4574.0
 05/18/23 09:50:58 (pid:1120439) Checking to see if htcondor is a writeable cgroup
 05/18/23 09:50:58 (pid:1120439) Cgroup /htcondor is useable
 05/18/23 09:50:58 (pid:1120439) Current mount, /tmp, is shared.
 05/18/23 09:50:58 (pid:1120439) Current mount, /, is shared.
 05/18/23 09:50:58 (pid:1120439) IWD: /var/lib/condor/execute/dir_1120439
 05/18/23 09:50:58 (pid:1120439) Output file: /var/lib/condor/execute/dir_1120439/_condor_stdout
 05/18/23 09:50:58 (pid:1120439) Error file: /var/lib/condor/execute/dir_1120439/_condor_stderr
 05/18/23 09:50:58 (pid:1120439) Renice expr "0" evaluated to 0
 05/18/23 09:50:58 (pid:1120439) Running job as user zwets
 05/18/23 09:50:58 (pid:1120439) About to exec [... omitted ...]
 05/18/23 09:50:58 (pid:1120439) Create_Process succeeded, pid=1120441
 05/18/23 09:50:58 (pid:1120439) Error while locating memcg controller for starter: 50014 Cgroup not initialized
 05/18/23 09:51:06 (pid:1120439) Failed to open '.update.ad' to read update ad: No such file or directory (2).
 05/18/23 09:51:06 (pid:1120439) Failed to open '.update.ad' to read update ad: No such file or directory (2).


Any suggestions on where to look or what could be the issue here?

Kind regards,
Marco

--
KCRI
Marco van Zwetselaar
Bioinformatician
Kilimanjaro Clinical Research Institute
P.O. Box 2236 | Moshi, Kilimanjaro | Tanzania