
RE: [Condor-users] Windows XP & 2000 GUI Crashes due to Condor



Hi John,

We are having the same problem here at Altera. The response that I
received from the Condor development team is that the problem should be
fixed in 6.6.6, but the fixes have not been fully merged back to the 6.7
series. I was told that 6.7.2 should contain most of them. 

This is a very serious problem. Nobody will be willing to join their
machines to the Condor pool if they keep seeing their application
windows getting killed one by one...

Jimmy

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of John Wheez
Sent: Tuesday, October 05, 2004 9:28 PM
To: Condor-Users Mail List
Subject: [Condor-users] Windows XP & 2000 GUI Crashes due to Condor

Hi,

I've noticed this a lot in many versions of Condor. I use Condor 6.7.1
on Linux as my master and Condor 6.7.1 on Windows XP on all client
machines.

Sometimes when I am using a machine that is executing background Condor
tasks, my Windows GUI and all programs and windows are shut down.
Then the GUI disappears. A few seconds later the Windows GUI comes back,
but all programs that were running have been killed. Also, when this
happens, condor_status and condor_q are unavailable on the Linux Condor
master. It's as if Condor on the master crashes and somehow crashes
all the client machines too. Awesome! But this is probably not the
behaviour that is desired.


When this happens, ALL Condor processes on the Windows machines are
killed, so I need to manually restart them or reboot the machines. On
the master Linux server, after a minute or two, condor_q and
condor_status produce output again, but the output is incorrect.
condor_status shows that the Windows clients are busy computing, but in
fact they are not, because the applications were killed and Condor is
not even running on those machines anymore. Windows Task Manager shows
the computer as 100% idle. Odd.

I suspect the Schedd daemon has something to do with this, because when
I try to shut down the binaries, Linux waits a while and says the
process is "defunct".

To get Condor running again, I have to shut down all the processes on my
master and restart them on my master and all clients... fun, fun.

My clients were not out of memory or disk space either... nowhere close.

Has anyone else noticed this behaviour?

JW


_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
http://lists.cs.wisc.edu/mailman/listinfo/condor-users