
Re: [Condor-users] [Condor 7.6.0] User activity detection woes under W7 & Vista



Thanks, we’ll be testing the new version ASAP.

 

----
Fabrice Bouyé (http://fabricebouye.cv.fm/)
Fisheries IT Specialist
Tel: +687 26 20 00 (Ext 411)
Oceanic Fisheries, Pacific Community
http://www.spc.int/

 

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Ian Chesal
Sent: Thursday, June 02, 2011 1:44 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] [Condor 7.6.0] User activity detection woes under W7 & Vista

 

There is a bug in the 7.6.0 condor_kbdd daemon: on resume, the listen port of the condor_startd can change, and often does. If the listen port of the condor_startd daemon changes, existing condor_kbdd instances silently stop sending updates on keyboard and mouse activity to the startd. In this case you'll see policies that normally rely on the keyboard idle and mouse idle timers start to fail.

 

It's fixed in 7.6.1.

 

Regards,

- Ian


-- 
Ian Chesal
ichesal@xxxxxxxxxxxxxxxxxx

http://www.cyclecomputing.com/

 

On Wednesday, June 1, 2011 at 12:24 AM, Fabrice Bouye wrote:

Hi,

thanks for the info but that’s not what’s written in the manual:

 

http://www.cs.wisc.edu/condor/manual/v7.6/10_Appendix_A.html#sec:Machine-ClassAd-Attributes

State:

String which publishes the machine's Condor state. Can be:

"Owner":

The machine owner is using the machine, and it is unavailable to Condor.

"Unclaimed":

The machine is available to run Condor jobs, but a good match is either not available or not yet found.

 

 

What we’ve seen over recent weeks is that jobs run even on PCs which are already in use.

When someone comes back to a computer that was unattended, the job is not evicted and keeps running, despite the rule being that it stays in RAM for 10 min before being evicted.

We have yet to start large-scale jobs on the new 7.6.0 nodes and we will definitely look at their behavior early next week (we are closed for the rest of this week).

 

Note: we use the default UWCS rules on most computers.

 

KeyboardIdle on the W7 64 PC on which I am typing this email is:

KeyboardIdle = 25208

Which I guess is wrong…

 

The same query from an XP machine returns 0.
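
For scale, assuming KeyboardIdle is expressed in seconds (as stated elsewhere in this thread), the value reported above works out to roughly 7 hours, which cannot be right for a keyboard currently in use. A small Python sketch of the conversion; `format_idle` is an illustrative helper name, not part of Condor:

```python
def format_idle(seconds):
    """Render an idle time given in seconds as H:MM:SS."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# KeyboardIdle value quoted in this message, assumed to be in seconds.
print(format_idle(25208))  # 7:00:08
```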

 


 

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Greg.Hitchen@xxxxxxxx
Sent: Wednesday, June 01, 2011 2:40 PM
To: condor-users@xxxxxxxxxxx
Subject: Re: [Condor-users] [Condor 7.6.0] User activity detection woes under W7 & Vista

 

A node being "Unclaimed" has nothing to do with keyboard/mouse activity. It is merely unclaimed in the context of Condor, i.e. no Condor process has claimed that node in order to run a job.

 

You need to check the KeyboardIdle parameter in the machine classad.

 

e.g. condor_status -l machinename | findstr KeyboardIdle

 

This will show the number of seconds since the last keyboard/mouse activity.
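
The same check can be scripted. A minimal sketch in Python, assuming the `condor_status -l` output consists of `Attribute = value` lines like the KeyboardIdle line quoted in this thread; the function name `parse_classad_attrs` and the sample text are made up for illustration:

```python
def parse_classad_attrs(text):
    """Parse 'Name = value' lines into a dict of strings."""
    attrs = {}
    for line in text.splitlines():
        # Split on the first '=' only, so values containing '==' stay intact.
        name, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '=' at all
            attrs[name.strip()] = value.strip()
    return attrs


# Hypothetical excerpt of `condor_status -l machinename` output.
sample = '''State = "Unclaimed"
KeyboardIdle = 25208
'''

attrs = parse_classad_attrs(sample)
# Seconds since the last keyboard/mouse activity, per the description above.
keyboard_idle = int(attrs["KeyboardIdle"])
print(keyboard_idle)
```

In practice the text could come from running the command quoted above and capturing its output; the parser deliberately keeps every value as a string and leaves type conversion to the caller.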

 

A machine can sometimes appear as Owner when the non-Condor CPU load is high enough.

 

condor_status -l machinename | findstr Cpu

 

will show the Cpu related classads.

 

Have you tested a node by sending a job to it, then sitting at it, using the mouse and/or keyboard, and seeing whether the job is evicted (provided you have set your configs that way)?

 

Cheers

 

Greg


From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Fabrice Bouye
Sent: Wednesday, 1 June 2011 10:20 AM
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] [Condor 7.6.0] User activity detection woes under W7 & Vista

Hi,

We’ve had issues in the past with condor 7.4.2 & 7.4.4 not correctly detecting user activity on Windows 7, but we tried to live with it because, at that time, we had only a few computers with this OS in our pool and most of the time their users were the ones wanting to run tasks on condor.

 

However, this year, due to both newcomers and the replacement of old computers in our programme, we’ve reached the point where W7 computers make up between 1/3 and 50% of all nodes in the pool. So condor losing track of the user’s presence and starting heavy tasks regardless is now a more pressing issue than it used to be.

 

To try to fix that, I’ve installed condor 7.6.0 on some test PCs and made some quick fixes to our local config file, and those nodes were ready to fly. I’ve also had a look at the “Upgrading from the 7.4 series to the 7.6 series of Condor” manual page (http://www.cs.wisc.edu/condor/manual/v7.6/8_2Upgrading_from.html) and, as a consequence, I’ve added KBDD to the DAEMON_LIST variable in the local configuration file of each test node.

I’ve checked in the logs that kbdd starts OK and condor_kbdd.exe is reported by the task manager as running.
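
One way to double-check that the KBDD entry is really present on every test node is to scan the local configuration text. A rough Python sketch, assuming a plain `DAEMON_LIST = A, B, C` assignment on a single line; it does not expand macros or handle line continuations, and both `daemon_list_entries` and the sample fragment are hypothetical:

```python
def daemon_list_entries(config_text):
    """Return the entries of the last DAEMON_LIST assignment found in the text."""
    entries = []
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#"):  # ignore comment lines
            continue
        name, sep, value = line.partition("=")
        # Assumption: the variable name is matched case-insensitively.
        if sep and name.strip().upper() == "DAEMON_LIST":
            entries = [item.strip() for item in value.split(",") if item.strip()]
    return entries


# Hypothetical local config fragment, for illustration only.
sample_config = """
# local overrides
DAEMON_LIST = MASTER, STARTD, KBDD
"""

print("KBDD" in daemon_list_entries(sample_config))  # True
```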

 

However, this does not seem to change anything, as the computer is still listed as Unclaimed most of the time. Sometimes, for unknown reasons, the computer is marked as Owner for 15 min to 1 h before quickly going back to Unclaimed. But most of the time it looks like my mouse & keyboard activity just goes unnoticed.

 

From what I’ve experienced on one of my test laptops, mouse and keyboard activity detection on Windows Vista seems to be sketchy as well.

 

Are there any plans to finally fix this issue in a future release? Or is there a workaround available?

 

Thanks.

 


 
