
[condor-users] Windows Condor silently hanging after a day or so...



So,

I've got a small (4-node) demo installation of Condor running on XP and W2K
boxes. But I'm noticing that frequently, when I come in in the morning,
condor_status lists only one or two machines. The only way to restore
the other machines is to go to each one and perform a "net stop condor"
followed by a "net start condor".
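
For reference, the manual workaround above could be scripted roughly like this. It is only a sketch: it assumes Python is installed on each box and that the script runs locally with enough privileges to control services. The helper names (restart_commands, restart_service) are made up for illustration; the only real commands involved are the "net stop condor" and "net start condor" already mentioned.

```python
import subprocess

# Service name as used in the "net stop condor" / "net start condor" workaround.
SERVICE = "condor"


def restart_commands(service=SERVICE):
    """Return the stop/start command pair for a Windows service."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service=SERVICE, runner=subprocess.run):
    """Run 'net stop' then 'net start' and return both exit codes.

    The start command is attempted even if the stop command fails,
    since the service may already be stopped.
    """
    codes = []
    for cmd in restart_commands(service):
        result = runner(cmd)
        codes.append(result.returncode)
    return codes


if __name__ == "__main__":
    print(restart_service())
```

This only automates the restart; it does nothing about whatever causes the machines to drop out of condor_status in the first place.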

The master logs on all the machines contain many records like these:

11/8 13:38:19 DaemonCore: Command received via UDP from host <54.14.48.190:4813>
11/8 13:38:19 DaemonCore: received command 60014 (DC_INVALIDATE_KEY), calling handler (handle_invalidate_key())
11/8 16:13:19 DaemonCore: Command received via UDP from host <54.14.48.190:1120>
11/8 16:13:19 DaemonCore: received command 60014 (DC_INVALIDATE_KEY), calling handler (handle_invalidate_key())
11/8 23:48:23 DaemonCore: Command received via UDP from host <54.14.48.190:1887>
11/8 23:48:23 DaemonCore: received command 60014 (DC_INVALIDATE_KEY), calling handler (handle_invalidate_key())
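
To see when these records occur relative to the time a machine drops out, the timestamps and sending hosts could be pulled out of a master log with something like the sketch below. It assumes the records look exactly like the excerpt above; the function name and the log path given on the command line are placeholders. It makes no claim about what DC_INVALIDATE_KEY means or whether it is related to the hang.

```python
import re
import sys

# Matches a "Command received via UDP" record immediately followed by a
# DC_INVALIDATE_KEY record carrying the same timestamp, as in the excerpt.
PATTERN = re.compile(
    r"(\d+/\d+ \d+:\d+:\d+) DaemonCore: Command received via UDP from host "
    r"<([\d.]+):(\d+)> "
    r"\1 DaemonCore: received command 60014 \(DC_INVALIDATE_KEY\)"
)


def invalidate_key_events(text):
    """Return (timestamp, host, port) for each DC_INVALIDATE_KEY record pair."""
    # Collapse all whitespace so records wrapped across lines still match.
    flat = " ".join(text.split())
    return [(m.group(1), m.group(2), int(m.group(3))) for m in PATTERN.finditer(flat)]


if __name__ == "__main__":
    # Usage: python script.py <path-to-master-log>
    with open(sys.argv[1]) as log:
        for stamp, host, port in invalidate_key_events(log.read()):
            print(stamp, host, port)
```

Comparing the printed times against the last time each machine showed up in condor_status would at least show whether the two line up.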

Are these messages relevant to the problem? What do I need to change to keep
my grid running overnight?
