
Re: [Condor-users] Collector error - ERROR: receiving new UDP message but found a long message still waiting to be closed




Upgrading to 7.0.0 should make this error message go away. However, I wouldn't recommend upgrading for this reason alone: 6.8.8 automatically recovers from this unexpected state, and the particular case you are encountering has a known, benign cause. Versions of Condor prior to 6.8.8 did not emit this error message, but only because the error went undetected. The only known case where the undetected error could cause serious problems is when jobs are submitted to a 6.9.3 or later schedd and run on 6.8.7 or earlier startds.
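
For anyone auditing a mixed-version pool, the one risky pairing named above can be written down as a small check. This is only a sketch: parse_version and is_risky_combination are hypothetical helper names, not part of Condor, and it assumes plain three-part version strings such as 6.8.8 that you have collected yourself.

```python
def parse_version(text):
    """Turn a dotted version string such as '6.8.8' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_risky_combination(schedd_version, startd_version):
    """Return True only for the pairing described above:
    a schedd at 6.9.3 or later with a startd at 6.8.7 or earlier."""
    return (parse_version(schedd_version) >= (6, 9, 3)
            and parse_version(startd_version) <= (6, 8, 7))


if __name__ == "__main__":
    # Example pairs are illustrative, not taken from any real pool.
    for schedd, startd in [("6.9.3", "6.8.7"), ("6.8.8", "6.8.8"), ("7.0.0", "6.8.8")]:
        print(schedd, startd, is_risky_combination(schedd, startd))
```

Comparing tuples of integers avoids the string-comparison trap where "6.10.0" would sort before "6.9.3".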

--Dan

Wojtek Goscinski wrote:

Howdy,

Can anyone tell me what might be causing the following errors in the Collector, Sched and Match logs? In particular, I'm interested in the "ERROR: receiving new UDP..." messages. I believe this has been occurring since I upgraded to Condor 6.8.8, installed through VDT. I'm running under Scientific Linux SL release 5.1, and the pool is managing around a hundred hosts without too many issues.

2/19 16:05:29 DC_AUTHENTICATE: attempt to open invalid session sloth1:4772:1202982643:4157, failing.
2/19 16:05:29 ERROR: receiving new UDP message but found a long message still waiting to be closed (consumed=0). Closing it now.
2/19 16:05:29 WARNING:  No master ad for < vm1@MONASH-F9AD285F >
2/19 16:05:29 StartdAd : Inserting ** "< vm1@MONASH-F9AD285F , 118.138.174.24 >"
2/19 16:05:29 stats: Inserting new hashent for 'Start':'vm1@MONASH-F9AD285F':'118.138.174.24'
2/19 16:05:29 StartdPvtAd : Inserting ** "< vm1@MONASH-F9AD285F , 118.138.174.24 >"
2/19 16:05:29 stats: Inserting new hashent for 'StartdPvt':'vm1@MONASH-F9AD285F':'118.138.174.24'
2/19 16:05:38 DC_AUTHENTICATE: attempt to open invalid session sloth1:5870:1203380734:1165, failing.
2/19 16:05:39 ERROR: receiving new UDP message but found a long message still waiting to be closed (consumed=0). Closing it now.
2/19 16:05:39 DC_AUTHENTICATE: attempt to open invalid session sloth1:5870:1203380734:1165, failing.
2/19 16:05:39 ERROR: receiving new UDP message but found a long message still waiting to be closed (consumed=0). Closing it now.


Any hints on what might cause this error are most welcome!

Regards,

James

