
Re: [Condor-users] Changing the time interval for matching




On Feb 15, 2011, at 16:49 , David Brodbeck wrote:



On Mon, Feb 14, 2011 at 7:23 AM, Stephen McGough <stephen.mcgough@xxxxxxxxxxxxxxx> wrote:
Dear All,

We have successfully set up a cluster which uses a non-Condor method for sleeping Windows 7 computers and uses Rooster to wake up these computers when jobs are waiting. However, we now have a race condition. The University here have a tight requirement for shutting down computers "out of hours", and if no Condor job starts within 5 minutes the computer will be powered down. We have also seen that it can take Condor 5 minutes to match and start a job on a computer once Rooster has woken it up. At the moment we're seeing a number of jobs which wake a computer up and fail to start within 5 minutes, so the computer goes back to sleep and the job then wakes up another computer. All jobs do eventually run, but it would be good to remove these unwanted wake-ups.

To do this we would like to reduce the amount of time Condor takes to match (we're trying to extend the time interval before a computer sleeps too). The START expression evaluates directly to true "out of hours".

You might try tweaking NEGOTIATOR_INTERVAL. The default is 300 seconds before 7.4.0, 60 seconds after. At our site we set it to 30 to reduce the amount of time jobs spend sitting in the queue; we have a small cluster, so the extra load on the central manager has not been a problem (or even noticeable, really).

Also look at SCHEDD_INTERVAL. You can decrease that as well, although a SUBMIT event should already trigger a new SCHEDD cycle, unless you've tweaked that setting.
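
As a rough sketch only, the two settings mentioned in this thread might look like this in the local Condor configuration file. The NEGOTIATOR_INTERVAL value of 30 is the one David reports using; the SCHEDD_INTERVAL value of 60 is purely an illustrative assumption, not a tested recommendation, and the right numbers will depend on the size of your pool.

    # On the central manager: time between negotiation cycles, in seconds
    # (30 is the value quoted earlier in this thread)
    NEGOTIATOR_INTERVAL = 30

    # On the submit machine: time between periodic schedd updates, in seconds
    # (60 is an assumed example value, not a recommendation)
    SCHEDD_INTERVAL = 60

After editing, running condor_reconfig on the affected machines should get the daemons to re-read their configuration; if a change does not seem to take effect, restarting the daemon in question is the fallback.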

-Peter