I've fixed the problem; it's working now. Two things had gone wrong, and neither was reported correctly in the log.
First, I had a wrong entry in the condor_config file:
FLOCK_TO = FALSE
FLOCK_TO/FLOCK_FROM is not a boolean; if it is set to anything other than empty, the value is assumed to be a hostname and Condor will try to resolve it. That's why I was getting:
IPVERIFY: unable to resolve IP address of FALSE
/lib64/tls/libc.so.6(__nss_hostname_digits_dots+0x47)[0x33394d7a87]
Once corrected, those errors were gone.
But the actual problem came from a confusing statement in the supplied condor_config file, which caused all of this trouble.
At line #961 of the condor_config file ($CONDOR_CONFIG), it says:
## The NEGOTIATOR_HOST parameter has been deprecated.
So I kept the NEGOTIATOR_HOST line commented out. That turned out to be a bug, already reported by our colleague at Cambridge eScience, which is being fixed/tested in v7.5. Until then, we do need to set:
NEGOTIATOR_HOST = $(CONDOR_HOST)
just like the old days. Once I set this, condor_schedd stopped crashing and everything came back to life. If anyone else is having a similar problem, it's worth checking.
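In case it helps anyone check their own setup, here is a rough Python sketch that looks for both mistakes in a config file's text. It is only an illustration, not a Condor tool: the function names (parse_config, check_config) are made up, and the parsing is deliberately naive. It assumes plain "NAME = value" lines and ignores line continuations, macros, and any local config files.

```python
import re

# Values that look like booleans; for FLOCK_TO/FLOCK_FROM these would be
# read as hostnames, as described above.
BOOLEAN_LIKE = {"true", "false", "yes", "no"}


def parse_config(text):
    """Collect simple NAME = value assignments, skipping comments and blanks.

    Naive by design: no line continuations, no macro expansion.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = re.match(r"([A-Za-z_][A-Za-z0-9_.]*)\s*=\s*(.*)$", line)
        if match:
            settings[match.group(1).upper()] = match.group(2).strip()
    return settings


def check_config(text):
    """Return a list of human-readable warnings for the two issues above."""
    settings = parse_config(text)
    problems = []
    for name in ("FLOCK_TO", "FLOCK_FROM"):
        value = settings.get(name, "")
        if value.lower() in BOOLEAN_LIKE:
            problems.append(f"{name} = {value}: treated as a hostname, not a boolean")
    if not settings.get("NEGOTIATOR_HOST"):
        problems.append("NEGOTIATOR_HOST is not set")
    return problems


if __name__ == "__main__":
    sample = "FLOCK_TO = FALSE\n#NEGOTIATOR_HOST = $(CONDOR_HOST)\n"
    for problem in check_config(sample):
        print(problem)
```

To run it against a real file, read $CONDOR_CONFIG into a string and pass it to check_config; an empty list means neither of the two problems was found in that file.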
On 18/11/2010 18:39, Ines Dutra wrote:
Dear all, regarding this problem mentioned by Santanu, I am having the same one here in the Biostats and Med Informatics Department. Besides having these error messages I also have messages such as: