
Re: [HTCondor-users] about START and RANK (and GPUs)



This brings to mind a puzzle I'm in the middle of.

At what point do the job rank and the negotiator pre- and post-job rank expressions come into play?

I'm trying to get my GPU machines to prefer GPU jobs, but I seem to have something backwards - non-GPU jobs are going to the GPU machines first.  Should I use machine rank or the pre/post-job rank configuration here? My current attempt is using negotiator_post_job_rank.

As was pointed out, the machine rank also has implications for job preemption - Todd's "California marriage" quip from HTCondor Week this year - so I'd need to ensure that is correctly configured.

Do I need to do both sides of the coin? Have non-GPU machines prefer non-GPU jobs as well as GPU machines preferring GPU jobs? 

But then I remember that GPU jobs won't match non-GPU machines, so the "jobs-with-GPU" rank doesn't need to address non-GPU jobs, right? That would focus on which GPU has the most free memory or lowest die temperature or what have you... So I should apply rank for GPUs in such a way that it only comes into play when everything else about the job matches, i.e., no GPU required?

	-Michael Pelletier.

-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Beyer, Christoph
Sent: Thursday, November 03, 2016 1:45 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] about START and RANK

Hi Jiang,

That's a sequential thing: the startd chooses the job that it prefers the most according to your rank expression.

Then, before starting the job, the startd evaluates the start expression to check whether everything is OK to start the job.

For example, you could use the rank expression to prefer a multicore job and the start expression to only run it if the free disk space at start time is > 200 GB ...
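
As a rough, untested sketch of that example in the startd configuration: this assumes the job advertises its core count as RequestCpus and the slot advertises its free disk as Disk in KiB, and the thresholds are only illustrative.

```
# Prefer multicore jobs: a job requesting more than one core gets the
# higher rank value (a true expression counts as 1, everything else as 0).
RANK = (TARGET.RequestCpus > 1)

# Only start a job while the slot reports more than 200 GB of free disk.
# Assumes Disk is in KiB, so 200 GB is written as 200 * 1024 * 1024.
START = (MY.Disk > 200 * 1024 * 1024)
```

Keep in mind that a non-trivial machine rank can also lead to preemption of a lower-ranked running job, so check your preemption settings before rolling out something like this.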

best regards
        ~christoph