
Re: [Condor-users] Condor pool nodes using WOL



On 12/15/05, Philipp Kolmann <kolmann@xxxxxxxxxxxxxxxx> wrote:
> Hi,
>
> David Wallom schrieb:
> > In order to satisfy the dual needs of a department we are trying to
> > instigate a method of using Wake On Lan with Condor so that a small
> > number of nodes stay up constantly to satisfy instantaneous demand
> > but the rest of them are sleep able to be woken on demand... Has
> > anyone else tried this? It would of course save massively on
> > electricity etc.
>
> I have realised that. I have made a small script that I call via cron to
> check if there are any jobs not running in the queue and in case there
> are jobs in need, I issue a WOL call.
<snip>
>
> #!/bin/sh
>
> # Copyright Philipp Kolmann, 2005
> # kolmann@xxxxxxxxxxxxxxxx
>
> # Wakes machines if there are condor jobs waiting.
> # MacAddress List comes from lrb1. Possibility to specify a special host on the
> # commandline
>
> condor_q -xml | grep JobStatus
> if [ "$?" = "1" ]
> then
>  exit 0
> fi

<snip>

As an alternative, especially if you need to run this against multiple
Condor schedds or the script needs to run at a relatively high
frequency, I suggest using

condor_status -schedd -format "%d\n" TotalIdleJobs

Making your script parse this output is then no problem, and the
performance hit to the farm is negligible. Adding constraints to avoid
querying machines which aren't allowed to use the execute machines
becomes simple with the -constraint option.
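
As a rough sketch of the parsing step (not Philipp's script, just one
way it could look): condor_status prints one TotalIdleJobs value per
schedd ad, so the script only has to add them up. The function names
sum_idle and count_idle_jobs are made up for illustration, and the
optional constraint expression is whatever suits your pool.

```shell
#!/bin/sh

# Sum idle-job counts read from stdin, one integer per line.
# Prints 0 when there is no input at all.
sum_idle() {
    awk '{ total += $1 } END { print total + 0 }'
}

# Ask the collector for TotalIdleJobs on every schedd ad and sum them.
# $1 is an optional ClassAd constraint expression (assumption: you
# supply one that matches your own pool's schedd ads).
count_idle_jobs() {
    if [ -n "$1" ]; then
        condor_status -schedd -constraint "$1" -format "%d\n" TotalIdleJobs
    else
        condor_status -schedd -format "%d\n" TotalIdleJobs
    fi | sum_idle
}
```

A cron wrapper could then do idle=$(count_idle_jobs) and only issue
the WOL call when "$idle" is greater than 0.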

In general, no automated tool should really try to 'scrape' info from
the schedds (via condor_q) unless there is no other option.
Communicate with the collector instead: it is faster and has far less
impact on your pool.

Matt