
Re: [HTCondor-users] optional gpu matching



On 5/26/2017 4:56 AM, Stefan Harjes wrote:
> Hi condor users,
> 
> I was looking into this optional gpu usage post (https://www-auth.cs.wisc.edu/lists/htcondor-users/2016-June/msg00051.shtml)
> which proposed, that for jobs which would use a gpu, but can do without I could add the below line to a submission file:
> 
> request_gpus = ifthenelse(gpus =?= UNDEFINED, 0, ifthenelse(gpus > 0, 1, 0))
> 
> while the jobs containing such a request get executed on gpu containing hosts, on hosts where gpus is not defined it does not run.
> Could this be due to the possibility, that request_gpu is not defined on targets without gpus?
> 

Is the problem that your job never matches machines with no gpus, or that it matches and then fails to run for some reason?

If it never even matches, I think the issue is this: when you specify

  request_gpus = X

then condor_submit defines the RequestGpus attribute to whatever you said (this is good), and then goes on to mess things up for you by "helpfully" editing the job ad's Requirements attribute to add a clause like so:

  Requirements = ... && ( TARGET.gpus >= RequestGpus) ...

You can see this with "condor_q -l".  The problem is then that the Requirements expression evaluates to undefined for machines that do not have Gpus defined, and a job only matches a machine when Requirements evaluates to true.
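
To make the undefined-propagation point concrete, here is a rough Python sketch of how I understand ClassAd three-valued logic to behave for that clause. It is a toy model, not HTCondor code: the helper names (ge, and_, matches) and the two example machine ads are made up for illustration, and the rest of the Requirements expression is simply assumed to be true.

```python
# Toy model of ClassAd three-valued logic (illustration only, not HTCondor code).
UNDEFINED = None  # stand-in for the ClassAd UNDEFINED value


def ge(a, b):
    """Model of `a >= b`: the result is UNDEFINED if either operand is."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return a >= b


def and_(a, b):
    """Model of `a && b`: FALSE wins, otherwise UNDEFINED propagates."""
    if a is False or b is False:
        return False
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    return True


def matches(machine_ad, request_gpus, other_clauses=True):
    """A machine matches only if Requirements evaluates to exactly TRUE."""
    # Models the clause ( TARGET.gpus >= RequestGpus ) added by condor_submit.
    gpu_clause = ge(machine_ad.get("gpus", UNDEFINED), request_gpus)
    return and_(other_clauses, gpu_clause) is True


gpu_host = {"gpus": 2}  # hypothetical machine ad that advertises gpus
cpu_host = {}           # hypothetical machine ad with no gpus attribute

print(matches(gpu_host, 1))  # True
print(matches(cpu_host, 0))  # False, even though the job asks for 0 gpus
```

The second result is the situation described in the original question: a request for zero gpus still does not match, because the comparison against a missing attribute is undefined rather than true.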

I think you can get the behavior you want by defining the RequestGpus attribute in your job classad directly via the "+" operator, which should result in condor_submit simply inserting the attribute into the job classad *without* messing around with the Requirements expression.  At least it works this way on my machine running HTCondor v8.7.1.

So TL;DR, I think this submit file will do what you want:

  executable = foo.exe
  +RequestGpus = ifthenelse(gpus =?= UNDEFINED, 0, ifthenelse(gpus > 0, 1, 0))
  rank = gpus > 0
  queue

Let us know how it goes!

Hope the above helps,
Todd