
[Condor-users] condor_rooster failing to crow



Dear All,

I'm trying to use condor_rooster in Condor 7.4 with our Windows XP pool,
but with only limited success. To keep compatibility with our current power-saving
setup I'm trying to avoid using the Condor power saving, and instead I'm publishing
the ClassAds of offline machines via a cron job so that condor_rooster can wake up
the relevant ones.

The crux of the matter seems to be the UNHIBERNATE expression. The documentation
(p 216) states that the default value is MachineLastMatchTime =!= UNDEFINED, although
I find that it is actually MY.MachineLastMatchTime =!= UNDEFINED. I've tried both and neither
works, as neither MachineLastMatchTime nor MY.MachineLastMatchTime seems
to be set. The manual says that

"the special attribute MachineLastMatchTime is updated in the ClassAds of offline machines
when the job would have been matched to the machine if it had been online"

but this doesn't seem to be happening. Using condor_q -ana reveals

019.009:  Run analysis summary.  Of 1 machines,
      0 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      1 match but are currently offline
      0 are available to run your job

so the matchmaking is definitely working - it just seems that the machine ClassAd isn't
updated. If I set MachineLastMatchTime to some arbitrary value myself then

ROOSTER_UNHIBERNATE=Offline && Unhibernate

seems to evaluate to TRUE and the wake-up kicks in.
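
To make my reading of these two expressions explicit, here is a minimal Python
sketch of how I understand them to evaluate. It is only a toy model and not Condor
code: the function names, the UNDEFINED stand-in and the sample timestamp are all
made up for illustration, and it assumes that the wake-up happens only when the
expression comes out as TRUE.

```python
# Toy model of the two expressions; not Condor code.
UNDEFINED = object()  # hypothetical stand-in for the ClassAd UNDEFINED value


def is_not_identical(a, b):
    """Model of the ClassAd =!= operator, which always gives plain True/False."""
    if a is UNDEFINED or b is UNDEFINED:
        return a is not b
    return a != b


def unhibernate(ad):
    # UNHIBERNATE = MY.MachineLastMatchTime =!= UNDEFINED
    return is_not_identical(ad.get("MachineLastMatchTime", UNDEFINED), UNDEFINED)


def rooster_unhibernate(ad):
    # ROOSTER_UNHIBERNATE = Offline && Unhibernate
    # Assumption: anything other than TRUE means "do not wake the machine".
    return ad.get("Offline", UNDEFINED) is True and unhibernate(ad)


# Offline ad as published by the cron job, attribute never updated.
print(rooster_unhibernate({"Offline": True}))
# Same ad with MachineLastMatchTime set by hand (arbitrary sample value).
print(rooster_unhibernate({"Offline": True, "MachineLastMatchTime": 1262304000}))
```

In this model the first ad never qualifies for a wake-up and the second one does,
which is the behaviour I am seeing.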

I've tried D_FULLDEBUG but I still can't track down where the problem is.

Any ideas?

regards,

-ian.


--------------------------------------------
Dr Ian C. Smith,
e-Science Team,
The University of Liverpool,
Computing Services Departmen