
Re: [Condor-users] State Change Delay - Hawkeye



Hi Nick:

I do not want the output of the Hawkeye job updated on the Collector
every time the job is run (since the output of the Hawkeye job is
guaranteed to change every time the job is run, IF_CHANGED amounts to
ALWAYS in my case). With a couple hundred to a thousand machines, this
is probably more network traffic than our sys admins are willing to handle.
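
To put rough numbers on that, here is a back-of-the-envelope sketch. It
assumes one collector update per machine per Hawkeye run, which is a
simplification on my part (the real count depends on how many ads each
startd sends), and the function name is just for illustration:

```python
def updates_per_second(machines: int, interval_s: float) -> float:
    """Average collector updates per second if every machine sends
    one update per interval (a simplifying assumption)."""
    return machines / interval_s


# Compare updating on every 15 s Hawkeye run against a 5 minute interval.
for machines in (200, 1000):
    every_run = updates_per_second(machines, 15)
    throttled = updates_per_second(machines, 300)
    print(f"{machines} machines: {every_run:.1f}/s every 15 s, "
          f"{throttled:.1f}/s every 300 s")
```

Under those assumptions, updating on every run is 20 times the traffic
of a 5 minute interval.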

I only need to update the collector with the changed ClassAds every 5
minutes or so - my issue is that the local startd only seems to evaluate
the CONTINUE/WANT_SUSPEND/SUSPEND expressions correctly after performing
a collector update.

Shouldn't the startd be able to process the variable/ClassAd changes due
to the Hawkeye job locally (e.g., without a Collector update)? If the
startd changes states or the "5 minute timer" expires, then I am willing
to let a collector update occur.
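
To make the behavior I am hoping for concrete, here is a small Python
sketch. It is not how the startd is implemented; the function names and
the 300 second interval are my own placeholders. The point is only that
the policy expressions get evaluated locally on every Hawkeye run, while
the collector is contacted on a state change or when the timer expires:

```python
UPDATE_INTERVAL_S = 300   # placeholder for "every 5 minutes or so"
BUSY_THRESHOLD_S = 5 * 60  # mirrors 5 * $(MINUTE) in my config


def keyboard_busy(idle_s: float) -> bool:
    """Local evaluation of KeyboardBusy from the latest Hawkeye value."""
    return idle_s < BUSY_THRESHOLD_S


def should_update_collector(now_s: float, last_update_s: float,
                            state_changed: bool,
                            interval_s: float = UPDATE_INTERVAL_S) -> bool:
    """Push to the collector only on a state change or an expired timer."""
    return state_changed or (now_s - last_update_s) >= interval_s


# Hypothetical samples: (time in seconds, keyboard idle in seconds).
samples = [(0, 400), (15, 415), (30, 2), (45, 17), (330, 302)]
last_update, was_busy = 0, keyboard_busy(samples[0][1])
for now, idle in samples[1:]:
    busy = keyboard_busy(idle)          # evaluated on every run
    if should_update_collector(now, last_update, busy != was_busy):
        last_update = now               # the only point that costs network traffic
    was_busy = busy
```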

David

On 06/21/2010 02:18 PM, Nick LeRoy wrote:
>> Hello Condor Gurus:
> Hello,
>  
>> I am experiencing some delays (seems like up to 5 minutes) moving between
>> states (Busy -> Suspended, Suspended -> Busy) when using
>> CONTINUE/WANT_SUSPEND/SUSPEND expressions that contain variables defined
>> in a Condor Hawkeye job.
>>
>> The Condor Hawkeye job runs every 15 seconds. The output of the job
>> contains a definition (POVB_HostOsKeyboardIdle). The relevant
>> configuration settings are below -
>> STARTD_CRON_JOBLIST = UPDATESTATS
>> STARTD_CRON_UPDATESTATS_PREFIX = POVB_
>> STARTD_CRON_AUTOPUBLISH = NEVER
>> STARTD_CRON_UPDATESTATS_EXECUTABLE = $(POVB_RELEASE_DIR)/bin/povb_hawkeye_stats
>> STARTD_CRON_UPDATESTATS_PERIOD = 15s
>> STARTD_CRON_UPDATESTATS_MODE = Periodic
>> STARTD_CRON_UPDATESTATS_KILL = True
> 
> Why do you have AUTOPUBLISH set to NEVER?  If you want to see the state pushed 
> to the collector every time it's changed, set AUTOPUBLISH=IF_CHANGED.
> 
> -Nick
> 
>> The local configuration file also contains
>> POVB_HostOsKeyboardIdle = 0
>> KeyboardBusy = (POVB_HostOsKeyboardIdle < 5 * $(MINUTE))
>> ConsoleBusy = (POVB_HostOsKeyboardIdle < 5 * $(MINUTE))
>>
>> The first line is there to make sure that KeyboardBusy and ConsoleBusy
>> do not evaluate to UNDEFINED before Condor runs the Hawkeye job for the
>> first time, but the value of POVB_HostOsKeyboardIdle should be redefined
>> every time the Hawkeye job runs.
>>
>> I can watch the output of the Hawkeye job and can mentally evaluate the
>> WANT_SUSPEND/SUSPEND expressions - but it takes Condor up to 5 minutes
>> after these expressions should go to TRUE to change the state. The
>> POLLING_INTERVAL is set to be 5 seconds - with the debug flags on, I can
>> see Condor performing these updates every 5 seconds, but the state does
>> not change. It is only when I see <eval_and_update_all> (which occurs
>> every 5 minutes) that the state finally changes.
>>
>> I assume the frequency of <eval_and_update_all> is related to the
>> Collector update time (UPDATE_INTERVAL). Is this correct? I certainly do
>> not want to update the collector every 5-15 seconds, just move between
>> the states.
>>
>> Am I missing anything?
>>
>> Thanks for all of your help and assistance. It is greatly appreciated!
>> David