
Re: [HTCondor-users] GPU Management



Hi Imre,

 

Can you send me your config file too?

 

Kind regards,

Caren

 

 

-----Original Message-----

From: htcondor-users-bounces@xxxxxxxxxxx [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Imre Szeberenyi

Sent: Monday, 11 March 2013 22:12

To: HTCondor-Users Mail List

Subject: Re: [HTCondor-users] GPU Management

 

Hi Owen,

 

Our solution is not as nice as Tim's.

We have 12 CPU cores and 2 Tesla cards, and we have defined one slot for each use case we expect.

Each supported use case has its own START condition

(single CPU, multi CPU, single GPU+GPU, single GPU, etc.). The config file is quite complex, but it works. I can send it to you if you are interested. (I don't want to pollute the list with it.)

 

Cheers,

 

Imre
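
For illustration only, the per-use-case slot layout described above could be sketched roughly as follows. This is not the actual config file from this thread. The 12 cores and 2 GPU cards come from the message; the choice of slots 1 and 2 as GPU slots, the device numbers, and the START policy are assumptions. The WantGPU job attribute is the one used in the RANK expression of the original question.

```
# Hypothetical sketch, not the config file mentioned in this thread.
# 12 static single-core slots on a 12-core machine.
NUM_CPUS = 12
SLOT_TYPE_1 = cpus=1
NUM_SLOTS_TYPE_1 = 12

# Only slots 1 and 2 advertise a GPU (assumed slot and device numbering).
SLOT1_HAS_GPU = True
SLOT1_GPU_DEV = 0
SLOT2_HAS_GPU = True
SLOT2_GPU_DEV = 1
STARTD_ATTRS = $(STARTD_ATTRS) HAS_GPU GPU_DEV

# GPU slots start only jobs that set +WantGPU = True;
# all other slots start only jobs that do not.
START = ifThenElse(MY.HAS_GPU =?= True, TARGET.WantGPU =?= True, TARGET.WantGPU =!= True)
```

With a layout like this, a job would ask for a GPU slot in its submit file with something like `+WantGPU = True` and `requirements = (HAS_GPU =?= True)`, so at most two GPU jobs could run per node. It does not cover the case where a user wants theirs to be the only GPU job on the node.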

 

On 2013.03.11. 15:04, Tim St Clair wrote:

> Hi Owen -

> 

> You may want to try our 'Machine Local Limits':

> https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2905

> 

> It's only in 7.9.0 and later.

> 

> Cheers,

> Tim

> 

> ----- Original Message -----

>> From: "Owen Hickey"<ohickey@xxxxxxxxxxxxxxxxxxxx>

>> To: htcondor-users@xxxxxxxxxxx

>> Sent: Monday, March 11, 2013 7:16:05 AM

>> Subject: [HTCondor-users] GPU Management

>> 

>> Dear Condor users and developers,

>> 

>> we are a research group for computational physics in Stuttgart,

>> Germany. We use condor to manage a lot of our computing resources.

>> Recently we have added GPUs to most of our nodes and would like to

>> include those as separate resources into Condor. We have tried the

>> recipe prescribed on the internet, namely putting

>> 

>>      SLOT1_HAS_GPU = TRUE

>>      SLOT1_GPU_DEV=0

>>      STARTD_ATTRS=HAS_GPU,GPU_DEV,GPU

>> 

>>      RANK = (target.wantGPU =?= true)*10000000

>> 

>> into the individual hosts configuration files. This does allow us to

>> ask for machines having a GPU in the submit script.  The problem is

>> that Condor launches as many jobs as there are CPU slots, thus making

>> the jobs run extremely slowly. What we would like to do is make it so

>> that Condor tries to launch two GPU jobs per node.  We would also

>> like to make it so that the user can request that theirs be the only

>> GPU job on the node.

>> 

>> Any help would be very much appreciated.

>> 

>> Owen Hickey

>> _______________________________________________

>> HTCondor-users mailing list

>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx

>> with a

>> subject: Unsubscribe

>> You can also unsubscribe by visiting

>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

>> 

>> The archives can be found at:

>> https://lists.cs.wisc.edu/archive/htcondor-users/

>> 


 
