
Re: [HTCondor-users] RANDOM_INTEGER problems on Windows



Hi Ian,

In the context where you are using it, I'd expect $RANDOM_INTEGER() to be reevaluated every time the startd restarts or is told to reread the configuration (condor_reconfig, SIGHUP).
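As a rough illustration of that behaviour (a Python model, not HTCondor code): if the macro is expanded once when the configuration is read, the expression the startd evaluates contains a fixed integer, so it matches exactly one ClockMin value per day. The names `read_config` and `periodic_vacate`, and the reboot time of 120 minutes past midnight, are made up for this sketch.

```python
import random

def read_config(rng):
    """Model one configuration read: the random macro is expanded here, once."""
    jitter = rng.randint(0, 10)  # inclusive on both ends, like the 0..10 range in the thread

    # After expansion the expression holds a literal integer; it is not redrawn.
    def periodic_vacate(reboot_time, clock_min):
        return (reboot_time - clock_min) == jitter

    return jitter, periodic_vacate

rng = random.Random(1)            # seeded only so the sketch is repeatable
jitter, expr = read_config(rng)   # models startd start-up or condor_reconfig

reboot_time = 120                 # hypothetical REBOOT_TIME, minutes past midnight
hits = [m for m in range(1440) if expr(reboot_time, m)]
print(jitter, hits)
```

Under this model `hits` always contains the single minute `reboot_time - jitter`. Calling `read_config` again stands in for a restart or reconfig and may pick a different minute.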

To make things explicit, you could put PERIODIC_VACATE into the startd ad and make your PREEMPT expression refer to it as a classad attribute rather than as a configuration macro. Then you can see the value with condor_status. Example of how to configure things that way:

PERIODIC_VACATE = ( ( $(REBOOT_TIME) - ClockMin ) == $RANDOM_INTEGER(0, 10) )
STARTD_EXPRS = $(STARTD_EXPRS) PERIODIC_VACATE
PREEMPT         = ($(UWCS_PREEMPT)) || PERIODIC_VACATE

I don't see why your policy is not working the way you want. Perhaps the above will help make it clear.

--Dan

On 11/30/12 5:25 AM, Smith, Ian wrote:
Hi Dan,

Thanks for the quick reply. Yes, having the correct syntax certainly
helps! I really should RTFM more carefully :-;

The strange thing is though that >this< expression never seems to evaluate
to TRUE (i.e. the jobs never get vacated).

PERIODIC_VACATE = ( ( $(REBOOT_TIME) - ClockMin ) == $RANDOM_INTEGER(0, 10) )
                                                      ^
If I run condor_config_val I see a different integer value each time, so the
big question is how often the random value is regenerated compared
with how often ClockMin is updated. Obviously if it's generated just once at
startup then there's no problem, but if the update periods are similar then
I can see why this would not work ...

Imagine for example that it is 10 minutes to reboot time and just
a few integers are generated in the following minute, e.g. 4, 8, 2, 3.
Then PERIODIC_VACATE doesn't evaluate to TRUE. By the same token on each
succeeding minute the integer needed for this to evaluate to TRUE
may also not be generated.
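To put a number on that worry, here is a small Python sketch of the hypothetical case where a fresh integer is drawn on every evaluation (Dan's reply at the top of the thread suggests this is not what actually happens). It assumes one evaluation per minute by default and independent, uniform draws from 0..10; `p_never_true` is a made-up helper name.

```python
from fractions import Fraction

VALUES = range(0, 11)  # the 11 integers a 0..10 inclusive draw can produce

def p_never_true(evals_per_minute=1):
    """Chance the equality never holds during the 11-minute window before reboot.

    Assumes a fresh, independent, uniform draw on every evaluation.
    """
    # Each evaluation misses the one value that would match with probability 10/11.
    p_miss_once = Fraction(len(VALUES) - 1, len(VALUES))
    # The window covers differences 10, 9, ..., 0, i.e. 11 minutes.
    total_evals = len(VALUES) * evals_per_minute
    return p_miss_once ** total_evals

p = p_never_true()
print(float(p))
```

With one evaluation per minute the miss probability is (10/11)^11, roughly 35%, so under this assumption a sizeable fraction of machines would reach reboot time without ever vacating.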

I'm sure there must be a way of expressing this so that PERIODIC_VACATE
evaluates to TRUE just once a day at a randomised time but I can't
see it at the moment.

any ideas ?

many thanks,

-ian.



-----Original Message-----
From: htcondor-users-bounces@xxxxxxxxxxx [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Dan Bradley
Sent: 29 November 2012 15:21
To: htcondor-users@xxxxxxxxxxx
Subject: Re: [HTCondor-users] RANDOM_INTEGER problems on Windows

Hi Ian,

There should be a $ in front of RANDOM_INTEGER.  Does that help?

--Dan

On 11/29/12 6:18 AM, Smith, Ian wrote:
Hello All,

I'm trying to configure our execute hosts to vacate jobs automatically
just before they are rebooted each night. To spread out the
checkpoints I've tried to add some "jitter" with RANDOM_INTEGER thus:

PERIODIC_VACATE = ( ( $(REBOOT_TIME) - ClockMin ) == RANDOM_INTEGER( 0, 10 ) )
PREEMPT         = $(UWCS_PREEMPT) || ( $(PERIODIC_VACATE) == TRUE )

but this does not seem to work. I can't track down a definitive error
message but it looks like the condor_startd (or possibly
condor_starter) is repeatedly failing and the shadow disconnecting
because of this.
If I take out the randomness, e.g.

PERIODIC_VACATE = ( ( $(REBOOT_TIME) - ClockMin ) == 0 )

everything works fine.

Has anyone else seen this ? Is RANDOM_INTEGER supported under Windows
or does it have some /dev/random dependence ?

I'm using Condor 7.6.2 on Windows 7 Enterprise.

regards,

-ian.

---------------------------------------
Dr Ian C. Smith,
Advanced Research Computing,
University of Liverpool, UK.
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/