RE: [Condor-users] range of LoadAvg.../ config issues....!!!!



hi...

i have condor on two test machines, a manager and client.

i'm trying to set up a basic system, where an app will run on the submitting
machine, and then if the cpu load is above a certain level, condor will run the
apps on the other machine, giving me a kind of basic load balancing
environment...

i have the following in my config files:
=================================
BackgroundLoad      = 0.7

CPUIdle      = ($(NonCondorLoadAvg) <= $(BackgroundLoad))
CPUBusy      = ($(NonCondorLoadAvg) >= $(HighLoad))

#bdouglas test params
WANT_SUSPEND        = FALSE
WANT_VACATE  = FALSE
START        = $(Test_Start)
SUSPEND      = FALSE
CONTINUE     = TRUE
PREEMPT      = FALSE
KILL         = FALSE
PERIODIC_CHECKPOINT = FALSE
PREEMPTION_REQUIREMENTS = FALSE
PREEMPTION_RANK     = 0

#####################################################################
## bdouglas - test network attribs
#####################################################################
Test_Start    = (CPUIdle)

=================================

however, condor seems to be in a continual wait state, as nothing is
happening!! if i replace the
START        = $(Test_Start)
with
START        = TRUE

everything seems to run on the submitting machine, and nothing is ever sent
to the 2nd machine.....

any ideas as to what i can do to accomplish my goal, or any insight into
what i'm overlooking....

thanks

-bruce
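
One thing that may be worth checking in the config above: Test_Start is set
to (CPUIdle), a bare name, rather than $(CPUIdle). In Condor config files
only the $(NAME) form expands a macro. A bare CPUIdle is passed through to
the START expression as a ClassAd attribute reference, and unless that
attribute is published in the machine ad it would likely evaluate to
UNDEFINED, so START never becomes TRUE. A minimal sketch of the macro form,
assuming NonCondorLoadAvg is still defined as in the stock config file:
=================================
BackgroundLoad = 0.7
CPUIdle        = ($(NonCondorLoadAvg) <= $(BackgroundLoad))

# $(CPUIdle) expands the macro text; a bare CPUIdle would instead be
# looked up as a ClassAd attribute when START is evaluated
Test_Start     = $(CPUIdle)
START          = $(Test_Start)
=================================
This is only a guess at the cause of the wait state. It has not been
verified against this particular setup.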


-----Original Message-----
From: Nathan Mueller [mailto:nmueller@xxxxxxxxxxx]
Sent: Saturday, October 09, 2004 7:09 AM
To: bruce
Cc: condor-users@xxxxxxxxxxx
Subject: Re: [Condor-users] range of LoadAvg...


> given that LoadAvg is a float.. am i correct in assuming that this
> attribute has a range between 0-1.0....

No, load average has no max value. The load is the average number of
processes running (or in disk wait) over some period of time. It's
pretty common to see loads over 10.

> if i'm wrong, can someone shed any light as to the range of this
> attribute. i'm considering using this as a way to essentially select a
> given machine within my network. ie, start on machine if the 'loadavg
> < 0.5'...

That's a pretty reasonable value. At the UW we don't start a job if the
load is greater than 0.3. This is also the default in the condor config.
That's to provide a fair number of cycles for jobs as well as to not
disrupt high load interactive jobs. If you want to run on machines with
the lowest load possible you'd be better off setting the job's rank in
your submit file to something like (1 / LoadAvg).
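
As a rough sketch of that last suggestion, a submit file could look
something like the one below. The executable and file names are
hypothetical, and the 0.01 offset is an assumption added here only to avoid
dividing by zero on a completely idle machine:
=================================
# hypothetical submit description file
universe     = vanilla
executable   = my_app
output       = my_app.out
error        = my_app.err
log          = my_app.log

# only consider machines whose load average is below 0.5
requirements = (LoadAvg < 0.5)

# among matching machines, prefer the one with the lowest load
# (a higher rank value is preferred)
rank         = 1 / (LoadAvg + 0.01)

queue
=================================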

        --Nate