
Re: [Condor-users] Whole System Scheduling



Would a simple poll of the CPU load, returning once n consecutive tests come in under x% total load, be a sufficient and simple check to be getting on with?
A limit on the number of iterations would avoid issues with machines that are loaded for some other reason.
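
A rough sketch of that poll, in Python. The function names, the default threshold, and the use of the 1-minute load average divided by the CPU count as a stand-in for "% total load" are all assumptions for illustration. The load average also decays slowly, so a real wrapper might want a different measure.

```python
import os
import time


def load_percent():
    """1-minute load average as a percentage of the CPU count (Unix only).

    This is an assumed stand-in for "% total load"; swap in another probe
    (memory usage, for example) by passing a different `sample` callable.
    """
    return 100.0 * os.getloadavg()[0] / (os.cpu_count() or 1)


def wait_for_idle(threshold=10.0, needed=3, max_iterations=60,
                  interval=5.0, sample=load_percent, sleep=time.sleep):
    """Return True once `needed` consecutive samples fall below `threshold`.

    Gives up and returns False after `max_iterations` samples, so a machine
    that is loaded for some unrelated reason cannot block the job forever.
    """
    consecutive = 0
    for i in range(max_iterations):
        if sample() < threshold:
            consecutive += 1
            if consecutive >= needed:
                return True
        else:
            consecutive = 0  # a busy sample resets the streak
        if i + 1 < max_iterations:
            sleep(interval)
    return False
```

A job wrapper could call wait_for_idle() before starting the real executable and decide for itself what to do on a False return (start anyway, or exit non-zero).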

If memory usage is the problem, checking that instead might be better.

Matt

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jonathan D. Proulx
Sent: 26 October 2009 18:21
To: Condor-Users Mail List
Subject: Re: [Condor-users] Whole System Scheduling


So to distill my troubles...

Using Dan's whole-system recipe, with the whole-system job suspending
until the system clears:

Basically works, unless it takes longer than
MaxSuspendTime for the other slots to clear.  At the end of
MaxSuspendTime I'm seeing overlapping runtime of anywhere from a
couple of seconds, OK I guess, to 10 minutes, which is not.  How long
you should wait for resources to free up is a question for the ages,
but once you decide not to, it seems like you should go right to
Vacating, do not pass Busy.


Attempting to kick everything out when a whole-system job starts
avoids the question of waiting:

SUSPEND = (SlotID =!= 1 && Slot1_RequiresWholeMachine =?= True)
PREEMPT = (SlotID =!= 1 && Slot1_RequiresWholeMachine =?= True)

Seems simple enough if rude, but again I'm seeing one to three minutes
of overlapping runtime before the other slots clear out. 

While I'd like it to be shorter, my main problem is the indeterminacy.
Is there any way on the execute site to see which slots are busy, so I
could write a wrapper that waits until they are free?
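
One possible shape for such a wrapper, sketched in Python and not tested against a real pool. It assumes condor_status is on the execute node's PATH, that the installed version supports the -direct and -format options, and that the whole-system job runs in slot1. The function names, host names, and defaults are hypothetical.

```python
import subprocess
import time


def busy_slots(status_text, own_slot="slot1"):
    """Return names of slots other than `own_slot` whose Activity is not Idle.

    Expects one "Name Activity" pair per line, e.g. "slot2@node1 Busy".
    """
    busy = []
    for line in status_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, activity = parts
        if name.startswith(own_slot + "@"):
            continue  # ignore the slot the whole-system job runs in
        if activity != "Idle":
            busy.append(name)
    return busy


def query_slots(hostname):
    """Ask the startd on `hostname` for each slot's Name and Activity.

    Assumption: -direct and -format behave this way in the installed Condor.
    """
    cmd = ["condor_status", "-direct", hostname,
           "-format", "%s ", "Name",
           "-format", "%s\n", "Activity"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def wait_for_other_slots(hostname, max_polls=60, interval=5.0,
                         query=query_slots, sleep=time.sleep):
    """Poll until no other slot is active; return False if polls run out."""
    for i in range(max_polls):
        if not busy_slots(query(hostname)):
            return True
        if i + 1 < max_polls:
            sleep(interval)
    return False
```

Anything other than Idle (Busy, Suspended, Vacating, and so on) is treated as "not yet free", which errs on the side of waiting longer.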

Thanks,
-Jon