
Re: [HTCondor-users] GPU Management



Hi Owen - 

You may want to try our 'Machine Local Limits': https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2905

It's only available in 7.9.0 and later.
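
As a rough, untested sketch of how this could look for your case: the
fragment below assumes the feature is configured through
MACHINE_RESOURCE_<name> knobs and requested with request_<name> in the
submit file, as described in the ticket. The resource name "gpus" and
the count of 2 are illustrative, so please check the knob names against
the documentation for the version you run.

    # Startd config on each GPU node (assumed knob names, see above)
    MACHINE_RESOURCE_gpus = 2

    # One partitionable slot that owns all CPUs and both GPU units
    SLOT_TYPE_1 = cpus=100%,auto
    SLOT_TYPE_1_PARTITIONABLE = TRUE
    NUM_SLOTS_TYPE_1 = 1

    # Submit file: an ordinary GPU job takes one of the two units
    request_gpus = 1

    # Submit file: a job that wants to be the only GPU job on the node
    # asks for both units instead
    # request_gpus = 2

With a limit of two units per machine, no more than two jobs that each
request one unit should be matched to the same node, and a job that
requests both units would keep other GPU jobs off that node while it
runs.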

Cheers,
Tim

----- Original Message -----
> From: "Owen Hickey" <ohickey@xxxxxxxxxxxxxxxxxxxx>
> To: htcondor-users@xxxxxxxxxxx
> Sent: Monday, March 11, 2013 7:16:05 AM
> Subject: [HTCondor-users] GPU Management
> 
> Dear Condor users and developers,
> 
> We are a research group for computational physics in Stuttgart,
> Germany. We use Condor to manage many of our computing resources.
> Recently we added GPUs to most of our nodes and would like to
> include them as separate resources in Condor. We have tried the
> recipe commonly found on the internet, namely putting
> 
>     SLOT1_HAS_GPU = TRUE
>     SLOT1_GPU_DEV=0
>     STARTD_ATTRS=HAS_GPU,GPU_DEV,GPU
> 
>     RANK = (target.wantGPU =?= true)*10000000
> 
> into the individual hosts' configuration files. This does allow us to
> ask for machines that have a GPU in the submit script. The problem is
> that Condor launches as many jobs as there are CPU slots, which makes
> the jobs run extremely slowly. What we would like is for Condor to
> launch two GPU jobs per node. We would also like to let a user
> request that theirs be the only GPU job on the node.
> 
> Any help would be very much appreciated.
> 
> Owen Hickey