
Re: [Condor-users] condor view reporting strange errors



On Tue, Aug 01, 2006 at 04:21:41PM +0100, Dr Ian C. Smith wrote:
> Hi,
> 
> I've recently been getting error reports from our condor view
> server of the form
> 
> "/opt/condor/sbin/condor_collector" on "ulgp1.liv.ac.uk" died due to signal 11.
> Condor will automatically restart this process in 10 seconds.
> 
> *** Last 20 line(s) of file CollectorLog:
> 8/1 15:58:53 		Error while removing ad
> 8/1 15:58:53 		**** Removing stale ad: "< ulgbc1.liv.ac.uk , 38.253.100.129 >"
> 8/1 15:58:53 		Error while removing ad
> 8/1 15:58:53 		**** Removing stale ad: "< ulgbc2.liv.ac.uk , 38.253.100.82 >"
> 8/1 15:58:53 		Error while removing ad
> 
> These machines aren't "real" execute hosts but represent clusters which can
> be reached using Condor-G. The classads for them are generated by a cron job
> from Globus MDS info and are updated every 5 minutes. Things seemed OK
> until I added the StartdIpAddr attribute to the classads in response to another
> error. Could this have something to do with it?
> 

Yes. Upgrade to 6.8.0, which fixes some bugs with the StartdIpAddr handling
in the collector.
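
For reference, here is a minimal sketch of how a cron-generated machine ad
might carry StartdIpAddr in the "<ip:port>" address form. The function name
render_machine_ad, the attribute set, and the port number are illustrative
assumptions, not details of the setup described above. The resulting text
could then be written to a file and, for example, handed to condor_advertise
with the UPDATE_STARTD_AD command.

```python
def render_machine_ad(name, ip, port):
    """Render a minimal machine ClassAd as 'Attr = "value"' text lines.

    Hypothetical helper for illustration; a real ad generated from MDS
    data would normally carry many more attributes.
    """
    attrs = [
        ("MyType", "Machine"),
        ("TargetType", "Job"),
        ("Name", name),
        ("Machine", name),
        # StartdIpAddr is written as an address string of the form <ip:port>.
        ("StartdIpAddr", "<%s:%d>" % (ip, port)),
    ]
    return "\n".join('%s = "%s"' % (key, value) for key, value in attrs) + "\n"


if __name__ == "__main__":
    # Host name and IP are taken from the quoted log; the port is a placeholder.
    print(render_machine_ad("ulgbc1.liv.ac.uk", "38.253.100.129", 12345), end="")
```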

-Erik