Re: [HTCondor-users] multi-gpu-nodes limit access per slot
- Date: Tue, 10 Dec 2019 17:20:48 +0000
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] multi-gpu-nodes limit access per slot
On 12/10/2019 10:52 AM, Beyer, Christoph wrote:
> Hi,
>
> I have one 4-GPU node and wonder if there is a way to limit usage on a per-slot basis, for example 4 slots that each see and access only a single GPU. Are cgroups the way to do this, and if so, how is it configured?
>
> Best
> Christoph
>
Maybe on this node just configure HTCondor with four static slots, each
with one GPU and some amount of CPU/RAM? If you need partitionable
slots for some reason (e.g. RAM), you could edit your START expression
so that only jobs requesting 0 or 1 GPUs will be matched.
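
A rough sketch of what the static-slot setup might look like in the
node's local configuration. The 25% CPU/memory split is only an
assumption for illustration, so adjust it to the hardware:

```
# Detect and advertise the GPUs on this machine.
use feature : GPUs

# Four static (non-partitionable) slots, each with one GPU and a
# quarter of the CPUs and memory. The 25% values are assumed.
SLOT_TYPE_1 = cpus=25%, memory=25%, gpus=1
SLOT_TYPE_1_PARTITIONABLE = FALSE
NUM_SLOTS_TYPE_1 = 4
```

If partitionable slots are needed instead, something along these lines
could restrict matching to jobs asking for at most one GPU. It assumes
START is already defined earlier in the configuration and that a job
with no RequestGPUs attribute should be treated as requesting zero:

```
# Only start jobs that request 0 or 1 GPUs.
START = ($(START)) && (TARGET.RequestGPUs =?= UNDEFINED || TARGET.RequestGPUs <= 1)
```

Running condor_reconfig (or restarting the startd, since slot layout
changes need a restart) and then checking condor_status should show
whether the slots came up as intended.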
As for restricting access to the GPUs, HTCondor will set the
CUDA_VISIBLE_DEVICES environment variable (and the OpenCL equivalent) to
point to the GPU provisioned to that slot. This environment variable is
honored by the low-level CUDA libraries. Are you worried about GPU codes
that purposefully ignore or clear this environment variable?
regards
Todd