
Re: [HTCondor-users] Running multiple jobs on the same GPU



From what I can tell, this isn't possible in a straightforward way.

 

CPU cores are fungible, so if you want to assign half a core to a job you can just set the machine's total CPU count to 2x what it actually is and then have the job request one CPU, which means it will get half of a physical core.

 

GPUs, however, are not fungible, because $CUDA_VISIBLE_DEVICES is used to tell the job which GPU to use. If you double-advertised the GPUs you wouldn't get CUDA0, CUDA0, CUDA1, CUDA1, but 0,1,2,3 instead.

 

Perhaps you could do something with a user job wrapper script to remap the visible devices on machines with double-advertised GPUs? Transform CUDA1 to CUDA0, and CUDA0,CUDA1 to CUDA0, etc?
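
As a rough, untested-on-a-pool sketch of that remapping idea: assume each physical GPU is advertised twice, so that advertised devices 2k and 2k+1 both refer to physical device k. The function names and the "factor" parameter below are made up for illustration and are not part of HTCondor.

```python
import os


def remap_visible_devices(value, factor=2):
    """Map advertised GPU indices back to physical ones.

    Assumption: each physical GPU is advertised `factor` times, so
    advertised index i belongs to physical index i // factor.
    Accepts entries such as "1" or "CUDA1".
    """
    physical = []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        if token.upper().startswith("CUDA"):
            token = token[4:]  # drop the "CUDA" prefix, keep the number
        index = int(token) // factor
        if index not in physical:  # CUDA0,CUDA1 collapses to a single 0
            physical.append(index)
    return ",".join(str(i) for i in physical)


def remapped_environment(env=None, factor=2):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES remapped."""
    env = dict(os.environ if env is None else env)
    if "CUDA_VISIBLE_DEVICES" in env:
        env["CUDA_VISIBLE_DEVICES"] = remap_visible_devices(
            env["CUDA_VISIBLE_DEVICES"], factor
        )
    return env
```

A wrapper along these lines would then start the real job with the adjusted environment, for example via os.execvpe(sys.argv[1], sys.argv[1:], remapped_environment()). Whether this plays well with how the startd hands out devices is something I haven't verified.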

 

NVIDIA's CUDA 9.1 package introduces a new service that partitions GPUs in the driver, so I think we're starting to get to the point where we'll need to see GPUs as partitionable resources. I've been meaning to experiment with that feature to see how one would go about advertising it to the collector.

 

                -Michael Pelletier.

 

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Vaurynovich, Siarhei
Sent: Wednesday, April 25, 2018 9:49 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [External] [HTCondor-users] Running multiple jobs on the same GPU

 

 

Hello,

 

Could you please help me figure out how to configure HTCondor to run multiple processes on the same GPU? Is it possible at all? Each process is rather light, using <=20% of the GPU, but there are many of them. I can certainly run more than one of them in parallel.

 

I restricted my processes to use only 1/3 of the GPU memory and provided in my submit file:

 

request_GPUs = 0.333

 

But HTCondor still runs only one GPU-using process at a time. Of course, I could restrict the number of slots and not tell HTCondor that I will be using the GPU, but I was wondering if there is a better solution.

 

Thank you for your help,

Siarhei.

 

 
