
[Condor-users] condor view reporting strange errors



Hi,

I've recently been getting error reports from our condor view
server of the form:

"/opt/condor/sbin/condor_collector" on "ulgp1.liv.ac.uk" died due to signal 11.
Condor will automatically restart this process in 10 seconds.

*** Last 20 line(s) of file CollectorLog:
8/1 15:58:53 		Error while removing ad
8/1 15:58:53 **** Removing stale ad: "< ulgbc1.liv.ac.uk , 38.253.100.129
"
8/1 15:58:53 		Error while removing ad
8/1 15:58:53 **** Removing stale ad: "< ulgbc2.liv.ac.uk , 38.253.100.82
"
8/1 15:58:53 		Error while removing ad

These machines aren't "real" execute hosts but represent clusters which can
be reached using Condor-G. Their classads are generated by a cron job from
Globus MDS information and are updated every 5 minutes. Things seemed OK
until I added the StartdIpAddr attribute to the classads in response to another
error. Could this have something to do with it?
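
To illustrate the setup described above, here is a minimal sketch of how a
cron-driven generator might build one of these ads. It is not the actual
script: the function name build_machine_ad, the port 9618 and the choice of
attributes are assumptions made purely for illustration. It strips stray
whitespace from each value because the quoted ad keys in the log above close
on the following line, which may point to a trailing newline in a generated
value.

```python
def build_machine_ad(name: str, ip: str, port: int) -> str:
    """Return old-style ClassAd text for a pseudo-machine ad (illustrative only)."""
    # Strip whitespace so no newline ends up inside a quoted string value.
    name = name.strip()
    ip = ip.strip()
    attrs = [
        ("MyType", '"Machine"'),
        ("Name", f'"{name}"'),
        ("Machine", f'"{name}"'),
        # Condor address strings conventionally take the form <ip:port>;
        # the port used here is a hypothetical placeholder.
        ("StartdIpAddr", f'"<{ip}:{port}>"'),
    ]
    return "\n".join(f"{key} = {value}" for key, value in attrs) + "\n"


if __name__ == "__main__":
    # Values as they might arrive from an MDS query, trailing newline included.
    print(build_machine_ad("ulgbc1.liv.ac.uk\n", "38.253.100.129\n", 9618), end="")
```

Text produced this way could be written to a file and sent to the collector,
for example with condor_advertise UPDATE_STARTD_AD, though whether cleaning
the values affects the crash is untested.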

I'm completely flummoxed by this, as everything has been working fine with
condor view for many months. Any help would be much appreciated!

regards,

-ian.

-----------------------------------
Dr Ian C. Smith,
e-Science team,
University of Liverpool
Computing Services Department