
Re: [HTCondor-users] condor_collector crash every 15min, ERROR "Assertion ERROR on (ip_string)"



Hi Hongwei,

https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=5762,56

While not exactly the same, this ticket shows a very similar issue that
is fixed in the upcoming 8.4.9. Could this be what you are encountering?

If you post the relevant parts of your configuration, I or someone else
may be able to help further.
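
As a side note on the log line "failed to connect, errno = 101": on Linux
that number is ENETUNREACH ("Network is unreachable"), which would fit a
host with no route to the address being probed. Below is a minimal Python
sketch of the common "connect a UDP socket, then ask which local address
was chosen" technique. It is only an assumption that the collector does
something comparable; the function names describe_errno and
probe_local_ip are hypothetical and are not part of HTCondor.

```python
import errno
import os
import socket


def describe_errno(code):
    """Return the symbolic name and message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


def probe_local_ip(peer, port=9):
    """Ask the kernel which local IPv4 address it would use to reach peer.

    Connecting a UDP socket sends no packets; it only performs a route
    lookup. Returns (ip, None) on success or (None, errno) on failure.
    This is a hypothetical illustration, not HTCondor's actual code.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.connect((peer, port))
        except OSError as exc:
            return None, exc.errno
        return sock.getsockname()[0], None


if __name__ == "__main__":
    # The loopback address should always be routable on a normal host.
    print(probe_local_ip("127.0.0.1"))
    # Translate the errno that a failed probe would report.
    print(describe_errno(errno.ENETUNREACH))
```

Running probe_local_ip against an address outside the isolated network
on the collector host would show whether the kernel has a route for it;
a result carrying errno.ENETUNREACH would point at a missing route rather
than at name resolution.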

Thanks,
Will

On 08/09/2016 10:13 AM, Ke, Hongwei wrote:
> Hello,
> 
> We have recently set up a condor cluster with a few machines in an isolated network that has no internet access. IP addresses are assigned by the interface configuration file, and hosts are resolved through the hosts file. Everything works very well almost out of the box, but the condor_collector crashes every 15 minutes with the following messages. Has anyone seen this before? Thank you very much!
> 
> =======================================================================================================
> *** Last 20 line(s) of file /var/log/condor/CollectorLog:
> 08/03/16 13:14:54 Query info: matched=227; skipped=8; query_time=0.002803; send_time=0.050182; type=Any; requirements={( ( ( MyType == "Scheduler" ) || ( MyType == "Submitter" ) ) || ( ( MyType == "Machine" ) ) )}; peer=<172.17.27.4:13756>; projection={}
> 08/03/16 13:15:03 Housekeeper:  Ready to clean old ads
> 08/03/16 13:15:03 	Cleaning StartdAds ...
> 08/03/16 13:15:03 	Cleaning StartdPrivateAds ...
> 08/03/16 13:15:03 	Cleaning ScheddAds ...
> 08/03/16 13:15:03 	Cleaning SubmittorAds ...
> 08/03/16 13:15:03 	Cleaning LicenseAds ...
> 08/03/16 13:15:03 	Cleaning MasterAds ...
> 08/03/16 13:15:03 	Cleaning CkptServerAds ...
> 08/03/16 13:15:03 	Cleaning CollectorAds ...
> 08/03/16 13:15:03 	Cleaning StorageAds ...
> 08/03/16 13:15:03 	Cleaning NegotiatorAds ...
> 08/03/16 13:15:03 	Cleaning HadAds ...
> 08/03/16 13:15:03 	Cleaning GridAds ...
> 08/03/16 13:15:03 	Cleaning XferServiceAds ...
> 08/03/16 13:15:03 	Cleaning LeaseManagerAds ...
> 08/03/16 13:15:03 	Cleaning Generic Ads ...
> 08/03/16 13:15:03 Housekeeper:  Done cleaning
> 08/03/16 13:15:04 SafeSock::my_ip_str() failed to connect, errno = 101
> 08/03/16 13:15:04 ERROR "Assertion ERROR on (ip_string)" at line 457 in file /slots/10/dir_4051338/userdir/.tmpQ815l4/BUILD/condor-8.4.8/src/condor_utils/condor_sockaddr.cpp
> *** End of file CollectorLog
> =======================================================================================================
> 
> Best regards,
> 
> Hongwei Ke
> 