
Re: [Condor-users] Collector house cleaning activities?

We have also seen this behaviour on a previous version (6.6.4) on Windows. Similarly, a reconfig or even a condor_status sent directly to the affected vm seems to fix it. I haven't got examples from the vm starter logs, but if other people are seeing this too I will try to catch it and send it in as a possible bug.


-----Original Message-----
From: "Ian Chesal" <ICHESAL@xxxxxxxxxx>
Date: Thu, 16 Sep 2004 07:31:09
To: "Condor-Users Mail List" <condor-users@xxxxxxxxxxx>
Subject: RE: [Condor-users] Collector house cleaning activities?

I'm also seeing this problem with dual-processor Windows 2k clients in
my pool -- the condor_status command will report only one of the two
vms on a machine. Running condor_reconfig <host> seems to fix the
problem. This is with 6.7.1.


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of
Sent: September 16, 2004 8:49 AM
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] Collector house cleaning activities?


I have a pool of 18 dedicated 4-way execute nodes.  My problem is that
condor_status sporadically reports fewer than the expected 72 nodes.  If
I use condor_status -direct NODENAME everything looks fine.  I've
turned on D_FULLDEBUG on the collector and have not seen anything
unexpected except for "Removing stale ads for vm?@NODENAME".  I've
turned on D_FULLDEBUG for the startd on the affected nodes and, again,
nothing seems to be in error. Another wrinkle to this problem is that it
seems to be only affecting the execute nodes on the same switch as the
central manager.

For example, a condor_status would report that I have vm2, vm3, vm4 but
no vm1.  If I start a run with condor_status reporting 71 nodes, that's
all I'll get.  If I loop through all nodes with a condor_restart
-startd before the classad lifetime expires I'll be OK.  I have reported
this before but without this level of detail.  I know that this might be
a problem on my end but I'm at my wits' end.
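
A rough sketch of how the check described above could be automated: compare
the ad names the collector reports against the vms the pool is expected to
have, and list the hosts that would need a condor_reconfig or a
condor_restart -startd.  The host names, the helper function names, and the
"vmN@host" naming pattern are assumptions for illustration, and the
condor_status -format invocation is assumed to be available in the installed
version; adjust both to match the pool.

```python
import subprocess


def expected_slots(hosts, vms_per_host):
    """Build the set of ad names the pool should advertise (vm1@host, ...)."""
    return {f"vm{i}@{host}" for host in hosts for i in range(1, vms_per_host + 1)}


def missing_slots(reported_names, hosts, vms_per_host):
    """Return the expected ad names absent from the collector's report."""
    reported = {name.strip().lower() for name in reported_names if name.strip()}
    expected = {slot.lower() for slot in expected_slots(hosts, vms_per_host)}
    return sorted(expected - reported)


def hosts_needing_refresh(missing):
    """Collapse missing ad names to the distinct hosts that own them."""
    return sorted({name.split("@", 1)[1] for name in missing})


def query_collector_names():
    """Ask the collector for one ad name per line.

    Assumption: condor_status accepts -format with a printf-style string
    and an attribute name in the installed version.
    """
    result = subprocess.run(
        ["condor_status", "-format", "%s\n", "Name"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

The hosts returned by hosts_needing_refresh() would be the candidates for the
per-host workaround; the comparison only shows which ads are missing, not why
the collector dropped them.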

Any help would be greatly appreciated.

Thank you,

Bob Nordlund


Condor-users mailing list