
Re: [HTCondor-users] Fractional GPU



Hello,

AFAIK condor doesn't support fractional GPUs.

You will probably find the following response from John on a similar topic helpful:

https://www-auth.cs.wisc.edu/lists/htcondor-users/2024-February/msg00037.shtml

Thanks & Regards,
Vikrant Aggarwal


On Fri, Feb 23, 2024 at 4:38 AM Larry Martell <larry.martell@xxxxxxxxx> wrote:
Proceeding under the assumption that condor does not directly support
fractional GPUs, I am trying what I read here:
https://www-auth.cs.wisc.edu/lists/htcondor-users/2020-December/msg00018.shtml:

>You can get HTCondor to do this just by having the same device show up more than once in the device enumeration.
>For instance, if you have two GPUs and your configuration is
>MACHINE_RESOURCE_GPUS = CUDA0, CUDA1
>You can run two jobs on each GPU by configuring
>MACHINE_RESOURCE_GPUS = CUDA0, CUDA1, CUDA0, CUDA1

I have 1 GPU and this is what I have in my config file:

#use feature:GPUs
#GPU_DISCOVERY_EXTRA = -extra
MACHINE_RESOURCE_GPUs = CUDA0, CUDA0, CUDA0, CUDA0

and this env setting: CUDA_VISIBLE_DEVICES="0"

But when I run multiple jobs requesting a GPU they run serially, not
in parallel.

Has anyone been able to get something like this working?
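
For reference, here is a minimal sketch of how the repeated-device approach quoted above might be configured through GPU discovery instead of a hand-written list. It assumes a condor_gpu_discovery version that supports the -divide option, and it assumes a single partitionable slot; neither is confirmed for this setup. Each job also requests one CPU by default, so the machine would need at least four cores for four jobs to run at the same time.

```
# Sketch only, not a verified fix for the setup described above.
# Assumes condor_gpu_discovery supports -divide, which lists each GPU
# N times and divides the advertised GPU memory by N.
use feature : GPUs
GPU_DISCOVERY_EXTRA = $(GPU_DISCOVERY_EXTRA) -divide 4

# Assumption: one partitionable slot owns all CPUs, memory, and GPU
# entries, so each job can carve out 1 CPU plus 1 GPU entry.
use feature : PartitionableSlot
```

With something like this in place, each job would still submit with request_GPUs = 1 rather than a fractional value, since every repeated entry is counted as a whole GPU.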

On Thu, Feb 22, 2024 at 3:53 PM Larry Martell <larry.martell@xxxxxxxxx> wrote:
>
> Does condor support fractional GPUs? I am setting request_GPUs = 0.25
> and it is matching (I can see that with -better-analyze and in the
> StartLog) but the job never runs, it stays in idle state.

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/