
Re: [HTCondor-users] How to tie slots to cpus ?



Luis-

I wouldn't describe that as hacky, FWIW. I agree that identifying a BASE_CGROUP per slot is a good choice. For your reference, Valerio:

https://htcondor.readthedocs.io/en/v9_1/admin-manual/configuration-macros.html?highlight=BASE_CGROUP#condor-procd-configuration-file-macros

One alternative is trying cgred (RedHat) or cgrulesengd (Debian), though it's a different kind of pain on both platforms. Here is a Debian how-to:

https://www.lesbonscomptes.com/recoll/faqsandhowtos/cgroups_instructions.html

You could pre-create all the per-slot cgroups, following the naming convention.
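Something along these lines, for example (a rough sketch only: the htcondor_numa0/htcondor_numa1 names and the CPU/memory-node ranges are placeholders for a dual-socket box, not anything HTCondor mandates):

# Pre-create one cpuset cgroup per NUMA node, matching the per-slot BASE_CGROUP names
cd /sys/fs/cgroup/cpuset
mkdir -p htcondor_numa0 htcondor_numa1
echo 0-15  > htcondor_numa0/cpuset.cpus
echo 0     > htcondor_numa0/cpuset.mems
echo 16-31 > htcondor_numa1/cpuset.cpus
echo 1     > htcondor_numa1/cpuset.mems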

An alternative track that might be neat: HTCondor integrating more directly into the SystemD hierarchy of cgroups. I believe that would look like:

htcondor.slice
├─htcondor-1.slice
│  └─job-X.scope
├─htcondor-2.slice
│  └─job-Y.scope
...

This brings along all the development work that integrates SystemD into cgroups extremely well (in my view). This would allow you to create files

/etc/systemd/system/htcondor-.slice.d/*.conf

that apply to all slots and

/etc/systemd/system/htcondor-1.slice.d/*.conf

that apply only to slot 1. Then HTCondor would be using systemd (and its cgroup settings) to create the scopes for dynamically-created slots, although I bet creating a SystemD scope is slower than manually creating or re-using a plain cgroup.
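A rough sketch of what such a drop-in could contain (assuming a cgroup-v2 host with a reasonably recent systemd, since AllowedCPUs= and AllowedMemoryNodes= only exist there; the ranges are placeholders):

# /etc/systemd/system/htcondor-1.slice.d/cpuset.conf  (sketch, not tested)
[Slice]
AllowedCPUs=0-15
AllowedMemoryNodes=0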

Obviously HTCondor plans on outliving SystemD. But in the meantime, SystemD is the default manager of services and cgroups on nearly all distributions.

Tom


On Mon, Oct 11, 2021 at 1:07 PM <valerio@xxxxxxxxxx> wrote:
On Mon, 2021-10-11 at 19:15 +0200, luis.fernandez.alvarez@xxxxxxx wrote:
Hello Valerio,

I am in the process of setting up a configuration similar to what you are looking for.

Just to confirm we're talking about the same scenario, this is what I plan to deploy:
  • One partitionable slot per NUMA node (in our machines with 2 NUMA nodes, I will set resources to 50%).
  • Each partitionable slot can define its own base cgroup, so I will define htcondor_numa0 and htcondor_numa1 and assign one to each slot (a rough config sketch follows this list).
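Roughly, the startd config I have in mind looks like this (a sketch only; the spelling of the per-slot BASE_CGROUP knob is my assumption, so double-check it against the BASE_CGROUP documentation linked above):

# Sketch: one partitionable slot per NUMA node, each with its own base cgroup
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=50%, memory=50%
SLOT_TYPE_1_PARTITIONABLE = True
# (SLOT_TYPE_<N>_BASE_CGROUP is a guess at the knob name; see the BASE_CGROUP docs)
SLOT_TYPE_1_BASE_CGROUP = htcondor_numa0

NUM_SLOTS_TYPE_2 = 1
SLOT_TYPE_2 = cpus=50%, memory=50%
SLOT_TYPE_2_PARTITIONABLE = True
SLOT_TYPE_2_BASE_CGROUP = htcondor_numa1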

Yes! This is exactly what I am looking for; my compute nodes have 2 CPUs.

Then, how to handle the NUMA nodes? In my case I am running CentOS 7, where systemd is the encouraged tool. It doesn't offer all the options, so it's going to be a bit hacky:

  • Create a couple of root systemd slices (htcondor_numa0, htcondor_numa1).
  • Create a companion service bound to the slice (htcondor_numa0-config.service).
  • This service is in charge of setting cpuset.cpus and cpuset.mems on the cgroup.
  • Finally, I add an extra dependency in systemd to ensure that the slices are enabled and run before the htcondor service (rough unit sketches follow this list).
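Roughly, the units look like this (a sketch only: unit names, paths and ranges are placeholders, and on CentOS 7 / cgroup v1 the cpuset values have to be written by hand because the slice unit itself cannot express them):

# /etc/systemd/system/htcondor_numa0.slice  (sketch)
[Unit]
Description=HTCondor cgroup for NUMA node 0

# /etc/systemd/system/htcondor_numa0-config.service  (sketch)
[Unit]
Description=Set cpuset.cpus and cpuset.mems for htcondor_numa0
After=htcondor_numa0.slice
Before=condor.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/mkdir -p /sys/fs/cgroup/cpuset/htcondor_numa0
ExecStart=/bin/sh -c 'echo 0-15 > /sys/fs/cgroup/cpuset/htcondor_numa0/cpuset.cpus'
ExecStart=/bin/sh -c 'echo 0 > /sys/fs/cgroup/cpuset/htcondor_numa0/cpuset.mems'

[Install]
WantedBy=multi-user.target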

Once I have this in place (I am planning to test it this week), I can share the final config defined in our cluster.


systemd is also present on Debian-based systems like Ubuntu (I use Ubuntu), so it should work on all those systems too.

I hope we can share our work.

cheers
Valerio




Cheers,
Luis


On 11/10/2021 09:26, valerio@xxxxxxxxxx wrote:
This is an example of how to use cpusets with shell:

mount -t cgroup -ocpuset cpuset /sys/fs/cgroup/cpuset
cd /sys/fs/cgroup/cpuset
mkdir Charlie
cd Charlie
/bin/echo 2-3 > cpuset.cpus
/bin/echo 1 > cpuset.mems
/bin/echo $$ > tasks
sh
# The subshell 'sh' is now running in cpuset Charlie
# The next line should display '/Charlie'
cat /proc/self/cpuset
I need an example of how to use cpusets for HTCondor slots.

Thanks.
Valerio



On Sun, 2021-10-10 at 08:00 -0500, tpdownes@xxxxxxxxx wrote:
Greg almost certainly wrote the thing. I must have seen (and searched for) the old ENFORCE_CPU_AFFINITY and SLOT settings I've used in the past.

That said, I only see that ASSIGN_CPU_AFFINITY claims to pin jobs to cores, not that it will do so in a NUMA-aware way.

Tom

On Sat, Oct 9, 2021 at 9:17 PM <valerio@xxxxxxxxxx> wrote:
On Fri, 2021-09-03 at 11:07 -0500, tpdownes@xxxxxxxxx wrote:
I think it's worth adding to Greg's response that I don't believe ASSIGN_CPU_AFFINITY will, all on its own, map jobs to NUMA nodes. I believe your best bet here is to create N partitionable slots, with N equal to the number of physical CPUs you have. Then combine ASSIGN_CPU_AFFINITY and SLOT<N>_CPU_AFFINITY so that each slot is mapped to a NUMA node.
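Something along these lines, for example (a sketch for a hypothetical 2 x 8-core machine; note the follow-up below about whether SLOT<N>_CPU_AFFINITY is still the right knob in current releases):

# Sketch: pin each partitionable slot to the cores of one NUMA node
# (core numbering is made up; check /sys/devices/system/node/node*/cpulist)
ASSIGN_CPU_AFFINITY = True
SLOT1_CPU_AFFINITY = 0,1,2,3,4,5,6,7
SLOT2_CPU_AFFINITY = 8,9,10,11,12,13,14,15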

If you want to polish the doorknob, you should also look into setting cgroup cpusets for HTCondor so that it has "exclusive" access where it can, and so that the other top-level cgroups (e.g. system.slice) do not have access to very many cores. There obviously has to be some overlap unless you're willing to reduce the number of cores available to HTCondor.
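In the same raw-cpuset spirit as the shell example earlier in the thread, roughly (the htcondor name matches the default BASE_CGROUP, the ranges are made up, and how you keep system.slice off those cores depends on your systemd setup, e.g. CPUAffinity= in /etc/systemd/system.conf):

# Sketch: reserve core 0 for the OS and give HTCondor the rest of a 32-core box
cd /sys/fs/cgroup/cpuset
mkdir -p htcondor
echo 1-31 > htcondor/cpuset.cpus
echo 0-1  > htcondor/cpuset.mems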


PS: you should consider reducing the CPU quota available to HTCondor in its cgroup. It's always good to have 0.25 core available for ssh!
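For example (the condor.service unit name and the 32-core count are assumptions; adjust the percentage to your machine):

# Sketch: cap the condor service at 31.75 cores, leaving ~0.25 core for ssh
systemctl set-property condor.service CPUQuota=3175%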


Tom


The 9.2.0 manual says that ASSIGN_CPU_AFFINITY replaces both ENFORCE_CPU_AFFINITY and SLOT<N>_CPU_AFFINITY.
Is SLOT<N>_CPU_AFFINITY still a valid configuration variable?


Thanks,
Valerio

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/