
Re: [Condor-users] Understanding user priority and job preemption





Jan Ploski wrote:
condor-users-bounces@xxxxxxxxxxx wrote on 03/13/2007 07:38:30 PM:

I too am curious why you don't see the expected ratio of jobs running.

Here is one thing that may help in your condor configuration (on the nodes running startds).

CLAIM_WORKLIFE = 600

This prevents the schedd's claim to the startd from lasting indefinitely. Without this setting, the schedd will hold on to a claim as long as it has jobs to run on it (and as long as it doesn't get preempted).

Thanks for the reply. Unfortunately, I am afraid that CLAIM_WORKLIFE does not affect job preemption.

Today I analyzed my problem some more. In particular, I tested a variant without any nodes that don't match job requirements. That is, I tested with 20 rather than 60 total nodes.

As before, user A has priority 4 and user B has priority 8.

In the 20-node scenario, I can observe the following behavior:
1. If user A submits jobs first, taking all machines, and user B comes in later, then user B does not get any machines; A's jobs are never preempted. User B does not get machines even if I remove some of user A's running jobs. In this case, A's jobs are preferred, no matter what.

This is the scenario in which CLAIM_WORKLIFE should decrease the amount of time it takes to balance out the share of the pool. Without limiting the lifespan of claims, it is expected that user A will retain 100% of the pool in the case you describe.

2. If user B submits jobs first, taking all machines, and user A comes in later, then B's jobs are preempted and the expected ratio of machines 1:2 becomes established.

Compare this with the 60-node scenario with 20 matching nodes, described in my previous message:
1. User A submits first, B comes later. The effect is the same as in case 1 above: B starves.
2. User B submits first, A comes later. Here, the expected 1:2 ratio does not set in. Instead, ALL 20 B-jobs are preempted and replaced with 20 A-jobs.

Based on these observations, I speculate that the following is true:
- Condor never preempts a running job of user A in favor of a job of user B when A.userprio < B.userprio, no matter what PREEMPTION_REQUIREMENTS is set to; this would explain the "insufficient priority" messages I see in NegotiatorLog.
- In the second scenario, Condor calculates A's pie slice as 2/3 * 60 = 40 nodes (rather than 2/3 * 20 = 13 matching nodes) and B's pie slice as 1/3 * 60 = 20 nodes (rather than 1/3 * 20 = 7 nodes). During negotiation, Condor tries to satisfy A's contingent first because A.userprio < B.userprio. All 20 matching nodes are assigned to A because 20 < 40. Next, Condor tries to satisfy B's contingent, but does not find any nodes which match or are preemptible based on the first rule. Therefore, B gets nothing.

Can anyone confirm that the above reasoning is correct?

Yes.  You are correct.

If it is correct:
- Why is Condor assigning "pie slices" based on the total number of nodes in the pool rather than the total number of matching nodes?

The negotiator (as currently implemented) does not have a big list of all the jobs from all the users. It just has a list of submitters (i.e. users), and it only ever considers one job at a time when making matchmaking decisions.

- Is there any way to achieve the expected 1:2 ratio between two users competing for N specific machines of a pool with a total size of M >= 3*N?

If you know in advance that users A and B will only ever be able to run on N machines within your pool, then you could use group quotas to specify their share of the N machines. Here's more information on how to set that up:

http://www.cs.wisc.edu/condor/manual/v6.8/3_4User_Priorities.html#SECTION00446000000000000000
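As a sketch of what that negotiator-side configuration might look like (group names and quota numbers here are placeholders for your setup, following the group-quota macros documented in the 6.8 manual):

```
# Define accounting groups and their static machine quotas
# (negotiator configuration; numbers chosen for a 2:1 split of 20 nodes).
GROUP_NAMES = group_a, group_b
GROUP_QUOTA_group_a = 13
GROUP_QUOTA_group_b = 7
```

Jobs would then opt into a group in their submit files, e.g. with a line like +AccountingGroup = "group_a.userA".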

For the general case where N is very dynamic, depending on job requirements, I cannot think of a configuration solution.

I hope that helps.

--Dan