
Re: [HTCondor-users] node gone from condor_status?



8.2.10 and 8.4 are pretty close together as versions. I can't think of anything that would cause startds to go missing between them.

If it's something about TCP vs. UDP updates, then a reconfig or restart of the node would make its ad show up in the Collector for a while (30 minutes or so) and then go away again.

The collector will log both the arrival and the expiration of the machine's ad. If the machine is removing itself for some reason, that will be logged using a different message than the expiration message. Something like this:

03/07/17 16:33:37.850 (D_ALWAYS) StartdAd     : Inserting ** "< slot1@xxxxxxxxxxxxxxxxxx , 128.105.136.34 >"
03/07/17 16:33:37.851 (D_ALWAYS) StartdPvtAd  : Inserting ** "< slot1@xxxxxxxxxxxxxxxxxx , 128.105.136.34 >"
... etc ...

03/07/17 17:03:27.686 (D_ALWAYS)    Cleaning StartdAds ...
03/07/17 17:03:27.686 (D_ALWAYS)        **** Removing stale ad: "< slot1@xxxxxxxxxxxxxxxxxx , 128.105.136.34 >"
... etc ...
03/07/17 17:03:27.688 (D_ALWAYS)    Cleaning StartdPrivateAds ...
03/07/17 17:03:27.688 (D_ALWAYS)        **** Removing stale ad: "< slot1@xxxxxxxxxxxxxxxxxx , 128.105.136.34 >"
... etc ...

03/08/17 09:19:13.028 (D_ALWAYS) Got INVALIDATE_STARTD_ADS
03/08/17 09:19:13.028 (D_ALWAYS)                **** Removed(1) ad(s): "< slot2@xxxxxxxxxxxxxxxxxx , 128.105.136.34 >"
03/08/17 09:19:13.028 (D_ALWAYS) (Invalidated 1 ads)
...etc...
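
If you want to pull just these events out of a busy CollectorLog, something like the following might help. It is only a rough sketch: the regular expressions are based solely on the sample lines above, so other versions or debug levels may word the messages differently, and the names `classify` and `scan` are made up for this example. It labels each matching line as "inserted" (ad arrived), "expired" (collector removed a stale ad), or "invalidated" (the ad was removed explicitly, as in the INVALIDATE_STARTD_ADS case).

```python
import re

# Patterns derived only from the sample CollectorLog lines quoted above;
# other HTCondor versions or debug settings may format these differently.
AD_NAME = r'"<\s*(?P<name>\S+)\s*,\s*(?P<addr>[^\s>]+)\s*>"'
PATTERNS = [
    ("inserted", re.compile(r"\w+\s*:\s*Inserting \*\* " + AD_NAME)),
    ("expired", re.compile(r"\*\*\*\* Removing stale ad: " + AD_NAME)),
    ("invalidated", re.compile(r"\*\*\*\* Removed\(\d+\) ad\(s\): " + AD_NAME)),
]


def classify(line):
    """Return (event, ad_name, address) for a recognised line, else None."""
    for event, pattern in PATTERNS:
        match = pattern.search(line)
        if match:
            return event, match.group("name"), match.group("addr")
    return None


def scan(lines, host_substring=""):
    """Yield (timestamp, event, ad_name) for ads matching host_substring.

    host_substring is compared against both the ad name and the address;
    an empty string matches every recognised line.
    """
    for line in lines:
        hit = classify(line)
        if hit and (host_substring in hit[1] or host_substring in hit[2]):
            # The first two whitespace-separated fields are date and time.
            timestamp = " ".join(line.split()[:2])
            yield timestamp, hit[0], hit[1]
```

To use it, open the collector's log file (its location depends on your LOG setting, so the path is yours to fill in) and loop over `scan(log_file, "128.105.136.34")`. An "inserted" followed roughly 30 minutes later by "expired" points at updates not reaching the collector; an "invalidated" means the ad was withdrawn on purpose.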

-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Dimitri Maziuk
Sent: Monday, March 6, 2017 4:55 PM
To: htcondor-users@xxxxxxxxxxx
Subject: Re: [HTCondor-users] node gone from condor_status?

On 03/06/2017 04:52 PM, Dimitri Maziuk wrote:
> # rpm -q -a | grep condor
> condor-8.2.10-345812.x86_64

PS it's centos 6 (I expected "el6" in the package name)

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu