Re: [Condor-users] condor_restart and missing machines (enhancement request)
- Date: Mon, 16 Jul 2007 09:13:37 +0100
- From: Ian Cottam <ian.cottam@xxxxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] condor_restart and missing machines (enhancement request)
I'm sure I've seen this behaviour too (Windows machines not reporting
in), as, sometimes, condor_q -g shows more jobs running than
condor_status does. Not sure what to do about it. I have just taken the
view of trying to grow our pool -- and those we can flock to -- so some
"drop outs" don't matter so much.
Kewley, J (John) wrote:
I am currently still experiencing problems with the reports from Windows PCs
failing to arrive at the central manager. I suspect this problem will go away
when they are all upgraded from 6.6 to 6.8, but in the meantime have this
If you do
condor_restart -master -all
all machines in your pool are sent a message to restart (subject to HOSTALLOW
settings of course).
Note that this sends the request to machines that aren't currently reporting in.
What I would like is to be able to say:
condor_restart -master -MIA
which would restart all the ones which aren't currently reporting in (missing in action),
but leave the others alone.
Is there any other way of getting at this full list of machines?
I could do this myself if I had a
command, but I can't see how to do this without storing the information myself.
Another alternative is of course to use TCP for the heartbeats to try and prevent this.
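For what it's worth, the "storing the information myself" route could be sketched roughly as below, in Python. This is only an illustration under assumptions, not something Condor provides: it assumes you keep your own list of host names in the pool, and that `condor_status -master -format "%s\n" Name` prints one reporting master per line on your version. The function names are made up for the example. It only works out which hosts are missing; what you then do with that list (e.g. feed it to condor_restart) is left open.

```python
import subprocess


def parse_hosts(text):
    """Return the set of lower-cased, non-empty host names, one per line."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def missing_hosts(known, reporting):
    """Return, sorted, the hosts we know about that are not reporting in."""
    return sorted(known - reporting)


def reporting_masters():
    """Ask the collector which masters are currently reporting in.

    Assumption: this condor_status invocation prints one master name
    per line; check it against your own Condor version first.
    """
    result = subprocess.run(
        ["condor_status", "-master", "-format", "%s\n", "Name"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_hosts(result.stdout)


def find_mia(known_hosts_path):
    """Compare a self-maintained host list (hypothetical file) with the pool."""
    with open(known_hosts_path) as handle:
        known = parse_hosts(handle.read())
    return missing_hosts(known, reporting_masters())
```

Calling `find_mia("known_hosts.txt")` (the file name is just a placeholder) on a machine that can query the central manager would return the "MIA" names. Host names are compared case-insensitively, so the stored list has to use the same form (short or fully qualified) that the collector reports.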
Information Systems Manager
Manchester Interdisciplinary Biocentre
The John Garside Building (Room G.002)
The University of Manchester
t: 0161 306 5198
m: 07856 849831