
Re: [Condor-users] split processors among multiple jobs



that actually helps a lot... thank you...

if you can bear with me for a minute..

If I put each machine in its own group, so the first machine would be group1, the second machine group2, etc.

Then I guess that before I submit a job I could query how many other jobs are running and which groups their ranks prefer, and then submit the job with a different rank. That way, you could load balance 20 jobs easily.

The only part where that might get tricky is this: if you submit 5 jobs with ranks for groups 1 to 5, but the whole network actually has 30 groups (1 for each proc), will they occupy the remaining groups equally, or will job 1 get the remaining 25 processor "groups"?


And then, if you submit another 5 jobs (while the other 5 are still running), what happens then?

Ian Chesal wrote:




Say I submit 2 jobs at the same time (each queuing 30 processes), so 60 jobs total, and I have 30 processors. Instead of processing job 1, waiting until it finishes and then processing job 2, I want it to split the 30 processors equally and give 15 processors to job 1 and 15 processors to job 2 right from the start.

Is there any way to do this?



The default behaviour of the schedd, all things being equal, is to perform FIFO execution of your clusters. So the schedd wants to run cluster 1, then cluster 2. But you can steer the jobs in a cluster towards particular machines with the RANK expression in your submission ticket. Or you can absolutely require jobs from a cluster to run on a specific set of machines using the REQUIREMENTS expression. You could also submit as two users, one for each cluster, and let the negotiator load balance automatically for you.

If you go with the RANK expression approach you accept some unequal
sharing between the clusters in exchange for potentially better
resource utilization (especially if your clusters are not the exact
same set of jobs). Assume you have half your machines tagged with the
classad attribute MyMachineGroup = 1 and the other half tagged with
MyMachineGroup = 2. (You could do this with machine names instead, but
it would be a rather large RANK expression.) In your first cluster's
ticket file you put:

RANK = MyMachineGroup =?= 1

and in your second cluster's ticket file you put:

RANK = MyMachineGroup =?= 2

Now cluster 1 prefers machines in the group labeled "1" and cluster 2
prefers machines in the group labeled "2". This is a less restrictive
way to steer your jobs towards preferred machines. If one cluster
finishes before the other, the other cluster's idle jobs can occupy
the less preferred half of the machines, so you get preference without
underutilization.
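
For illustration, the setup described above might look roughly like the
sketch below. MyMachineGroup is the attribute name from the example
above; the executable name my_job and the queue count of 30 are
placeholders. The knob that publishes a custom attribute in the machine
ad has been called STARTD_EXPRS in older Condor releases and
STARTD_ATTRS in newer ones, so treat that line as an assumption and
check the manual for your version.

```
# Local config on each machine in the first half of the pool (assumed layout)
MyMachineGroup = 1
STARTD_ATTRS = $(STARTD_ATTRS) MyMachineGroup

# Submit file for the first cluster (my_job is a placeholder)
universe   = vanilla
executable = my_job
rank       = (TARGET.MyMachineGroup =?= 1)
queue 30
```

Machines in the second half would set MyMachineGroup = 2, and the second
cluster's submit file would compare against 2 instead. The TARGET.
prefix just makes it explicit that the attribute is looked up in the
machine ad.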

To make it a hard separation, with no chance that a job from cluster 1
would run on machines preferred by jobs from cluster 2, simply change
RANK to REQUIREMENTS in the above example. Now you're saying "you can
only run on machines where MyMachineGroup is defined to be 1" (or 2 for
cluster 2). Great for absolute separation, but lousy for usage if one
cluster finishes before the other, because the remaining idle jobs can
never move to the empty machines.
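
As a sketch of the hard-separation variant, under the same assumptions
(machines already advertise MyMachineGroup, and my_job is a placeholder
executable name), only one line of the submit file changes:

```
# Submit file for the first cluster, hard separation (my_job is a placeholder)
universe     = vanilla
executable   = my_job
requirements = (TARGET.MyMachineGroup =?= 1)
queue 30
```

Because =?= only evaluates to true when the attribute is defined and
equal to 1, machines that do not advertise MyMachineGroup at all would
never match these jobs.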

Hope that helps.

- Ian