
RE: [Condor-users] Restarting comms with XP nodes (SP2 UDP bugette)



Zach,

My bad.  It was a typographical error.  TCP updates appear to be working
now.  Thanks for your help.

-Bryan

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Zachary Miller
Sent: Thursday, June 23, 2005 10:51 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Restarting comms with XP nodes (SP2 UDP bugette)

On Thu, Jun 23, 2005 at 10:05:11AM -0400, Bryan S. Maher wrote:
> Zach,
> 
> > for this, you need to also define COLLECTOR_SOCKET_CACHE_SIZE.
> 
> Yeah, did that too.  Sorry I wasn't more clear.  I followed the
> procedure in the manual. I put this in my config files: 
> 
> 	COLLECTOR_SOCKET_CACHE_SIZE = 99 and 
> 	UPDATE_COLLECTOR_WITH_TCP   = TRUE  
> 
> I pushed the updated config files out to my master/collector and to two
> test execution/submit nodes.  I issued condor_reconfig to all three.  I
> immediately began receiving the "...via TCP; ignored" errors.  I tried
> restarting condor on the master first, then on the two execution/submit
> nodes.  The error persisted.

strange.  i am unable to replicate this.  are you sure the reconfig/restart
was successful (and didn't get PERMISSION DENIED or anything)?  also, you can
find out for sure if the collector recognized the setting by turning on
D_FULLDEBUG for COLLECTOR_DEBUG.  you should see a line like this in the
CollectorLog:
  6/23 09:32:23 (fd:8) Using a SocketCache for TCP updates (size: 99)

otherwise, you'll see a line like this:
  6/23 09:31:25 (fd:8) No SocketCache, will refuse TCP updates

if you want to try that, it will help me figure out what's going on.
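
A rough sketch of that check in Python, for anyone who wants to script it.
It assumes the CollectorLog lines have exactly the wording quoted above; the
function name and the sample lines are made up for illustration and are not
part of Condor:

```python
import re

# Patterns based on the two CollectorLog lines quoted above. The exact
# wording in other Condor versions may differ (assumption).
CACHE_ON = re.compile(r"Using a SocketCache for TCP updates \(size: (\d+)\)")
CACHE_OFF = re.compile(r"No SocketCache, will refuse TCP updates")


def socket_cache_status(lines):
    """Return ("enabled", size), ("disabled", None), or ("unknown", None).

    The last matching line wins, on the assumption that later log entries
    reflect the most recent configuration.
    """
    status = ("unknown", None)
    for line in lines:
        match = CACHE_ON.search(line)
        if match:
            status = ("enabled", int(match.group(1)))
        elif CACHE_OFF.search(line):
            status = ("disabled", None)
    return status


# Hypothetical sample: the collector first started without a cache and was
# later restarted with COLLECTOR_SOCKET_CACHE_SIZE defined.
sample = [
    "6/23 09:31:25 (fd:8) No SocketCache, will refuse TCP updates",
    "6/23 09:32:23 (fd:8) Using a SocketCache for TCP updates (size: 99)",
]
print(socket_cache_status(sample))  # ('enabled', 99)
```

To run it against a real log, pass the open file object instead of the
sample list. A result of "unknown" most likely means D_FULLDEBUG is not
turned on for the collector.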


> Thanks for the "session" info.  I keep my entire pool at the same
> revision level so there is no mixed versioning going on here.  Everyone
> is running 6.6.9 currently.   I'm curious, you said that if the
> collector is restarted, it would invalidate any open sessions.  Does the
> same hold true in reverse?  That is, if an execution node is restarted,
> is there a potential for the same problem?

not in the case of the Collector (it doesn't contact the startds) but the
Negotiator would try to contact a startd if a job was run, and then would fail
the first time because of the missing session.  then the startd informs the
negotiator of the invalid session, and next time (next negotiation cycle) the
negotiator will start a new session and everything will be back to normal.


> I'll try setting SEC_DEFAULT_NEGOTIATION = OPTIONAL and see if that
> helps.

it should, so long as you aren't using any strong authentication, crypto,
or integrity checks.  they all require the use of sessions.
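
One way to sanity-check that before relaxing negotiation is to scan the
config for the session-dependent features. The sketch below is only an
illustration: it handles plain KEY = VALUE lines (no macro expansion or
includes), looks only at the SEC_DEFAULT_* macros, and treats both REQUIRED
and PREFERRED as "in use", which is an assumption. The helper names are
hypothetical.

```python
# Security macros for the features that, per the reply above, depend on
# sessions. Per-daemon overrides (e.g. SEC_CLIENT_*) are not examined here.
SESSION_FEATURES = (
    "SEC_DEFAULT_AUTHENTICATION",
    "SEC_DEFAULT_ENCRYPTION",
    "SEC_DEFAULT_INTEGRITY",
)

# Values counted as "the feature is in use" (assumption, see lead-in).
ACTIVE_VALUES = ("REQUIRED", "PREFERRED")


def parse_config(text):
    """Parse simple KEY = VALUE lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().upper()] = value.strip()
    return settings


def session_dependent_settings(settings):
    """Return the session-dependent macros that appear to be in use."""
    return [
        name
        for name in SESSION_FEATURES
        if settings.get(name, "").upper() in ACTIVE_VALUES
    ]


# Hypothetical config fragment for illustration only.
example = """
SEC_DEFAULT_NEGOTIATION = OPTIONAL
SEC_DEFAULT_INTEGRITY   = REQUIRED
"""
print(session_dependent_settings(parse_config(example)))
# ['SEC_DEFAULT_INTEGRITY']
```

An empty list suggests nothing in the scanned text forces sessions, though
it says nothing about settings defined in other config files.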


cheers,
-zach

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users