
Re: [Condor-users] Windows wake-on-lan / "green" computing environments query




On Mar 31, 2009, at 5:05 PM, Rob wrote:

> Say, you have a pool of 1000 PCs and there is a submitted job that cannot find a resource to run on, possibly because Condor has sent too many PCs into hibernation.
> You should then decide to wake up an appropriate PC.
> How can you decide which PC in hibernation is the right one?
>
> Condor can help if it has stored the ClassAds of the hibernated PCs, before hibernating them.
>
> Therefore, if Condor remembers the ClassAds of PCs it has hibernated, it can then also quickly decide which hibernated PC should be woken up by WOL to run the job.

That would obviously be the best situation. I get the impression that it's a planned feature, but it's not implemented in 7.2. An external script isn't going to have all the information it needs to wake up the "best" machine. However, our cluster is fairly homogeneous, so I think I'm going to write a naive script that just wakes up machines based on the number of idle jobs in the queue.
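For illustration, here is a rough sketch of what such a naive script could look like in Python. It is not the script actually used on this cluster, only a minimal example under stated assumptions: the MAC addresses in POWERED_OFF_MACS are made up, the helper names are hypothetical, idle jobs are assumed to be counted by running something like `condor_q -format "%d\n" JobStatus` and counting lines equal to 1 (Idle), and every powered-off machine is treated as interchangeable. The only standard part is the wake-on-LAN magic packet itself: 6 bytes of 0xFF followed by the target MAC address repeated 16 times, sent as a UDP broadcast.

```python
import math
import socket
import subprocess

# Hypothetical list of MAC addresses of machines known to be powered off.
POWERED_OFF_MACS = ["00:11:22:33:44:55", "00:11:22:33:44:56"]


def build_magic_packet(mac: str) -> bytes:
    """Return a WOL magic packet: 6 x 0xFF, then the MAC repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must have 12 hex digits: %r" % mac)
    return b"\xff" * 6 + bytes.fromhex(hex_digits) * 16


def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 is a common choice)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))


def count_idle_jobs(condor_q_output: str) -> int:
    """Count idle jobs, assuming one JobStatus integer per line (1 == Idle)."""
    return sum(1 for token in condor_q_output.split() if token == "1")


def pick_machines_to_wake(idle_jobs, off_macs, slots_per_machine=1):
    """Naive policy: wake just enough machines to cover the idle jobs."""
    needed = math.ceil(idle_jobs / slots_per_machine)
    return off_macs[:needed]


if __name__ == "__main__":
    # Assumed invocation; adjust to whatever your condor_q supports.
    result = subprocess.run(
        ["condor_q", "-format", "%d\\n", "JobStatus"],
        capture_output=True, text=True, check=True,
    )
    idle = count_idle_jobs(result.stdout)
    for mac in pick_machines_to_wake(idle, POWERED_OFF_MACS):
        send_wol(mac)
```

Run from cron every few minutes, something like this would wake machines whenever jobs are waiting. It ignores job requirements entirely, which is only reasonable on a homogeneous pool, and it does not track which machines it has already tried to wake.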

> A practical problem here: immediately after start-up, a PC is in Owner state
> and thus it takes a while before it becomes available for running jobs.


You could get around this by using suspend-to-RAM or suspend-to-disk instead of actually shutting the machine down. There's support for that in Condor already, but it won't work in my case because my machines won't respond to wake-on-LAN when they're hibernating, only when they're off.

--

David Brodbeck
System Administrator, Linguistics
University of Washington