
Re: [HTCondor-users] Disk I/O control with cgroup:blkio?



Hi Dimitri,

many thanks for the info! I will try your config on our SL6 machines.

I guess the cgconfig.d conf works only with 2.6, and for systemd one
would need a drop-in. As for systemd, the syntax is somewhat
different (not sure if it is also better...).
From what I just learnt, the systemd-related options are probably now
  IO{Read,Write}{Bandwidth,IOPS}Max
with the device selected by its /dev path [1]. I *assume* that these get
translated by systemd into the standard cgroup parameters??
Anyway, I am just testing something like [2] but so far the limits seem
not to be propagated towards the parent condor cgroup or its slot
subgroups [3] :-/
Have to fiddle a bit more with systemd...

Cheers and thanks,
  Thomas


[1]
https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html
I stumbled over
https://www.certdepot.net/rhel7-get-started-cgroups/
but the options the article uses (CPUShares and BlockIOWeight) seem to
be legacy nowadays (??)

[2]
> /etc/systemd/system/condor.service.d/20-blkio.conf
[Service]
IOReadBandwidthMax=/dev/disk/by-uuid/abcdef-12345-6789  12345678
IOWriteBandwidthMax=/dev/disk/by-uuid/abcdef-12345-6789  12345678
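
A drop-in like [2] could also be generated per node rather than written by hand. The following is only a minimal sketch: the function name render_dropin and its parameters are hypothetical, and it simply reproduces the two IO*BandwidthMax lines from above without checking that the running systemd version honours them.

```python
def render_dropin(device: str, read_bps: int, write_bps: int) -> str:
    """Return the text of a [Service] drop-in limiting read/write bandwidth.

    device is a /dev path (for example a /dev/disk/by-uuid/... link);
    the limits are plain bytes per second.
    """
    lines = [
        "[Service]",
        f"IOReadBandwidthMax={device} {read_bps}",
        f"IOWriteBandwidthMax={device} {write_bps}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only; replace with the real device and limits.
    print(render_dropin("/dev/sda", 104857600, 104857600), end="")
```

The output would go into a file such as the 20-blkio.conf from [2], followed by a systemctl daemon-reload and a restart of the service.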

[3]
>
/sys/fs/cgroup/blkio/system.slice/condor.service/blkio.throttle.read_bps_device

> cat /sys/fs/cgroup/blkio/system.slice/condor.service/condor_var_lib_condor_execute_slot1_2@xxxxxxxxxxxxxxxxx/blkio.throttle.*
8:0 Read 0
8:0 Write 0
8:0 Sync 0
8:0 Async 0
8:0 Total 0
Total 0
8:0 Read 0
8:0 Write 0
8:0 Sync 0
8:0 Async 0
8:0 Total 0
Total 0
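
To compare such counters across many slot cgroups, the "major:minor operation value" lines can be parsed into a dictionary. This is a rough sketch, assuming the three-column layout shown in [3]; parse_blkio_stats is a hypothetical helper name. It is meant for the contents of one file at a time, since with concatenated output like the above a later block overwrites an earlier one.

```python
from collections import defaultdict


def parse_blkio_stats(text: str) -> dict:
    """Parse lines like '8:0 Read 0' into {device: {operation: value}}."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:  # "<major:minor> <operation> <value>"
            dev, op, value = fields
            stats[dev][op] = int(value)
        # Two-field lines such as "Total 0" are the overall total; skipped here.
    return dict(stats)


if __name__ == "__main__":
    sample = "8:0 Read 0\n8:0 Write 0\n8:0 Total 0\nTotal 0\n"
    print(parse_blkio_stats(sample))
```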


On 2017-08-30 19:43, Dimitri Maziuk wrote:
> We had jobs fail because of too much unzip/untarring and I added
> 
> /etc/cgconfig.d/condor.conf:
> group htcondor {
>     cpu {}
>     cpuacct {}
>     memory {}
>     freezer {}
>     blkio {
>         blkio.throttle.write_bps_device = "8:0 104857600
> 8:16 104857600";
>     }
> }
> 
> The errors seem to have disappeared since.
> 
> Note that you have to get the major:minor for each disk you want to
> throttle on each node, which could be a bit of a PITA. And the newline
> syntax is silly, but that's how you specify multiple disks.
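
Looking up the major:minor numbers per node can be scripted. A minimal sketch, assuming Linux and a block device node such as /dev/sda; the helper names throttle_rule and throttle_rule_for are made up for illustration.

```python
import os
import stat


def throttle_rule(rdev: int, bytes_per_sec: int) -> str:
    """Format one 'major:minor limit' entry from a raw device number."""
    return f"{os.major(rdev)}:{os.minor(rdev)} {bytes_per_sec}"


def throttle_rule_for(path: str, bytes_per_sec: int) -> str:
    """Build the entry for a block device node, e.g. /dev/sda."""
    st = os.stat(path)  # follows symlinks, so /dev/disk/by-uuid/... links work
    if not stat.S_ISBLK(st.st_mode):
        raise ValueError(f"{path} is not a block device")
    return throttle_rule(st.st_rdev, bytes_per_sec)


if __name__ == "__main__":
    # 100 MiB/s, the same limit as in the cgconfig example above.
    print(throttle_rule(os.makedev(8, 0), 100 * 1024 * 1024))
```

Several such entries, one per disk, would then be joined with newlines to form the blkio.throttle.write_bps_device value.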
