
Re: [Condor-users] question about preemption policy



Sorry, I am afraid I did not state my question clearly. We have
50 nodes with 4 vm's per node, that is, a pool of about 200 vm's. We want to
preempt jobs from a certain user (and only from this user) if they take
more than 25% of the slots in the entire pool, that is, more than 50 vm's.
These jobs allocate one vm per job and are likely to be cpu-intensive.
Thanks,  Ilya
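
To make the threshold concrete, here is a minimal Python sketch of the
arithmetic described above. It is only an illustration of the policy we are
after, not Condor configuration: the names pool_size, slot_cap and may_preempt
are made up for this example, and the numbers are the ones quoted in this
message.

```python
# Illustration only: the policy arithmetic, not a Condor expression.
NODES = 50          # nodes in the pool (from the message)
VMS_PER_NODE = 4    # vm's per node (from the message)
MAX_SHARE = 0.25    # fraction of the pool the user may hold before preemption

def pool_size(nodes=NODES, vms_per_node=VMS_PER_NODE):
    """Total number of vm's in the pool."""
    return nodes * vms_per_node

def slot_cap(total_vms, max_share=MAX_SHARE):
    """Largest number of vm's the user may hold without being preemptable."""
    return int(total_vms * max_share)

def may_preempt(user_running_vms, total_vms, max_share=MAX_SHARE):
    """True only when the user holds MORE than max_share of the pool."""
    return user_running_vms > total_vms * max_share

if __name__ == "__main__":
    total = pool_size()
    print("pool size:", total)
    print("cap for the user:", slot_cap(total))
    for running in (10, 50, 51):
        print(running, "vm's in use, preemptable:", may_preempt(running, total))
```

With one vm per job, the user's running job count is the same as the number
of vm's held, so the check applies directly to the number of running jobs.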

On Wed, 7 Jun 2006, Matt Hope wrote:

> On 6/6/06, Ilya Narsky <narsky@xxxxxxxxxxxxxxx> wrote:
> >
> > In our condor pool, we would like the negotiator to preempt user A's jobs
> > in favor of users with lower priority factors only if user A takes more
> > than 25% of the cpu's and leave user A's jobs running otherwise. Can this
> > be accomplished? What are the condor variables that can be used to specify
> > PREEMPTION_REQUIREMENTS? We are using condor 6.7.18.  Thanks, Ilya
> 
> A vm (in the current condor sense of the word) can only service one job at a time.
> If you want (as I understand it) a job to remain running when it is
> a low-CPU task and to allow another one to run at the same time,
> then you must have two VMs (or, if you have a 2-way SMP machine, 4 VMs
> etc.) where one VM is for low-CPU jobs only and the other is for high-CPU
> load ones.*
> 
> The trick is in ensuring that the relevant jobs go to the proper
> places. If your jobs are very well segmented (one set will always be
> low CPU utilization, the other always high) then you can achieve this
> if your users mark and direct their jobs appropriately. If not, you
> have no way to control it.
> 
> On a side note, and not wanting to be teaching the sucking of eggs,
> this setup is often not the best for throughput. Most tasks tend
> to be CPU, memory, disk or network bound, so even though a
> job seems to 'only' be taking 25% CPU time, the other factor may well
> slow down the other, more CPU-intensive job more than you expect. Of
> course, in a non-checkpointing environment where you need less latency
> on those high-CPU jobs, it may be that the reduction in preemption
> costs outweighs the reduction in theoretical ideal throughput.
> 
> As far as condor goes, if you wish to use preemption to reduce latency
> on certain jobs, the best** way to reduce this cost is to get some form
> of checkpointing (even if done by hand via the signal mechanisms in
> the vanilla universe). Obviously this will sometimes just not be
> possible.
> 
> Matt
> 
> * Note that in an ideal world the second vm would be for _either_ job
> type, but there are some nasty subtle behaviours that would make trying
> to achieve this tricky.
> 
> ** By this I mean probably the best in real-world benefit but, in some
> ways more importantly, the best way to leverage condor so that you are
> working with it rather than fighting it.