[Condor-users] Suspend and resume jobs on demand



Hi all
 
I was wondering if anyone has experience with, or suggestions for, the following setup. We run Windows machines, so checkpointing is not supported. Preemption is turned off because we don't want to lose the progress of running jobs. Is there a way to suspend running jobs (which usually take days) in order to run newly submitted jobs (which usually take minutes or hours), and then resume the suspended jobs once the short jobs finish?
 
I was thinking that I could set NUM_CPUS to double the actual number of CPUs and set the STARTD policy so that when half of the CPUs are running jobs, the other half can't match a job. When short jobs arrive, identified either by accounting group or by a config variable, the running jobs would be suspended and the short jobs would run on the other half of the CPUs. Is this configuration feasible?
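
For concreteness, the idea might look roughly like the untested sketch below for a machine with one real CPU advertised as two slots. It is only an illustration under several assumptions: IsShortJob is a hypothetical custom job attribute (it would have to be set in the submit file, for example with +IsShortJob = True), slot1 is arbitrarily reserved for short jobs, and the slot-based names (SlotID, slot1_Activity, STARTD_SLOT_ATTRS) assume a Condor version that uses slots. Memory is still divided between the two advertised slots, so job memory requirements may need adjusting.

```
# Untested sketch: one real CPU, advertised as two slots.
# IsShortJob is a hypothetical job attribute, not a built-in one.

# Advertise twice the real CPU count
NUM_CPUS = 2

# Publish each slot's Activity into the other slot's ad
# (visible there as slot1_Activity, slot2_Activity)
STARTD_SLOT_ATTRS = Activity

# slot1 accepts only short jobs; slot2 accepts only long jobs,
# and only while slot1 is not running anything
SLOT1_START = (TARGET.IsShortJob =?= True)
SLOT2_START = (TARGET.IsShortJob =!= True) && (slot1_Activity =!= "Busy")
START = ((SlotID == 1) && ($(SLOT1_START))) || \
        ((SlotID == 2) && ($(SLOT2_START)))

# Suspend the long job on slot2 while slot1 is busy, resume it afterwards
WANT_SUSPEND = True
SUSPEND      = (SlotID == 2) && (slot1_Activity =?= "Busy")
CONTINUE     = (SlotID == 2) && (slot1_Activity =!= "Busy")

# Never preempt or kill a running job
PREEMPT = False
KILL    = False
```

On a machine with N real CPUs the same pattern would presumably need N short-job slots paired with N long-job slots, with each long-job slot watching the Activity of its partner.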
 
Thanks
Rick
 
