Re: [HTCondor-users] Issues regarding use of NVIDIA MIGs with HTCondor
- Date: Fri, 8 Apr 2022 10:43:43 -0500 (CDT)
- From: Todd L Miller <tlmiller@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Issues regarding use of NVIDIA MIGs with HTCondor
For (4), I'll defer to the GPU experts on the list.
(4) is a known problem. We've been working on higher-priority
improvements to our GPU support, much of which will appear in the
upcoming 9.8.0 release.
For work-arounds, if the MIG GPU(s) are the only ones on the
system, it's fairly easy for the startd to enforce that only one GPU is
given to each job (START = $(START) && RequestGPUs <= 1). It becomes much
more complicated if you have non-MIG GPUs mixed in (you have to isolate
the MIG GPUs to their own partitionable slot).
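
As a rough sketch of the MIG-only case, the execute node's local
configuration might look something like the following. The
isUndefined() guard is an assumption added here so that jobs which
request no GPUs at all can still match; drop it if the node should
only run GPU jobs. This is untested and only illustrates the idea.

```
# Local config on an execute node whose only GPUs are MIG instances.
# Append to the existing START expression rather than replacing it.
# Assumption: jobs that leave RequestGPUs undefined should still be
# allowed to start, hence the isUndefined() guard.
START = $(START) && (isUndefined(TARGET.RequestGPUs) || TARGET.RequestGPUs <= 1)
```

After a condor_reconfig, running condor_config_val START on the node
should show the combined expression.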
With 9.8.0, you should also be able to have the
multi-GPU-requesting jobs specify that they require non-MIG GPUs,
although the required expressions may be clumsy.