
Re: [Condor-users] dynamic slots - over subscription



Thanks, Matt, for your reply (and thanks to everyone else who replied).

We are running in a Linux environment, and the only way to find the
offending user is to log into the box and see who is running what. I wish
there were an easier way to do this. I thought the whole point
of dynamic slots was to stop people from oversubscribing resources.

I suppose I can always do this, right?

START = $(LoadAvg) < $(NUM_CPUS)

This should stop any more jobs from being started on a machine whose
load average has already reached its CPU count.
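
One caveat on the START line above, as far as I can tell: LoadAvg is a
machine ClassAd attribute rather than a configuration macro, so
$(LoadAvg) would probably expand to nothing. A minimal sketch of the
same idea written against ClassAd attributes, assuming the startd on
these machines advertises TotalLoadAvg and TotalCpus (untested on this
pool):

# Sketch only: accept new jobs while the whole machine's load average
# is below its core count. TotalLoadAvg and TotalCpus are machine
# ClassAd attributes, so they are written without $().
START = TotalLoadAvg < TotalCpus

This would only keep new jobs off a machine that is already loaded. It
does not identify or stop a running job that uses more cores than it
requested.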




On Wed, Jan 13, 2010 at 7:17 AM, Matt Hope <Matt.Hope@xxxxxxxxxxxxxxx> wrote:
> enforce their choice :)
>
> Set the CPU affinity of the jobs to match their request (tricky with dynamic partitions, since you need to work out a way to dynamically partition the mask; having Condor do this for you would be ideal).
>
> On Windows I'd write a little wrapper that asks a local service for a mask. On each request, the service checks whether the jobs holding the currently active masks are still alive, so it can free the cores of any that have departed. Use a user job wrapper to ensure the CPU affinity is always applied on start-up.
>
> If anyone tries to work around this, either a) stamp on them hard, or b) move to job objects and clamp their memory usage as well.
>
> You could try to make this fancy (by trying to be NUMA friendly where possible).
>
> By effectively penalising people that do it wrong you would likely find that people moved towards getting it right.
>
> Matt
>
> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Mag Gam
> Sent: 13 January 2010 11:41
> To: Condor-Users Mail List
> Subject: [Condor-users] dynamic slots - over subscription
>
> Let's say I have a process which takes up 4 CPUs, and I have ten 16-core
> servers, each with 64 GB of memory.
>
> To get more of my jobs to run I do:
> #This should be 4
> RequestCpus = 1
>
> How can I prevent users from doing this? It clearly creates extra load
> on the servers.
> _______________________________________________
> Condor-users mailing list
> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/condor-users/
>