
Re: [Condor-users] flocking / CCB



Hi Don,

06/09/12 16:39:05 CCBListener: failed to receive message from CCB server 10.178.6.5


Could you provide more logs?  I'm specifically interested in any log message containing CCB.

It also may be helpful to add D_FULLDEBUG and D_COMMAND to COLLECTOR_DEBUG on the machine serving as your CCB server.  This will give you messages when daemons try to register themselves for CCB access.
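
As a rough sketch, assuming the collector's existing debug settings can simply be replaced, the line in condor_config on the CCB server might look like this:

COLLECTOR_DEBUG = D_FULLDEBUG D_COMMAND

The extra messages should then show up in the collector's log on that machine.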

--Dan

On 6/9/12 4:16 PM, Shrum, Donald C wrote:
I'm trying to get a test job to flock between FSU and USF here in Florida.

Our cluster is on a private network and only the central manager has a public IP, so I added the following to condor_config on the central manager:

PRIVATE_NETWORK_NAME = fsu-hpc-condor-private
PRIVATE_NETWORK_INTERFACE = 10.178.6.5


I added CCB_ADDRESS and the same PRIVATE_NETWORK_NAME to the processing nodes' condor_config.
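
For reference, a minimal sketch of what such node-side settings commonly look like, assuming the collector on the central manager acts as the CCB server. The CCB_ADDRESS value shown is illustrative, not copied from the actual configuration:

# Assumption: register with the CCB server running in the pool's collector
CCB_ADDRESS = $(COLLECTOR_HOST)
# Same private network name as on the central manager
PRIVATE_NETWORK_NAME = fsu-hpc-condor-private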

As far as I can tell, CCB runs inside the collector, so I don't need to enable it explicitly.



I must be missing something simple in the setup.  I see errors like this:
06/09/12 16:39:05 CCBListener: failed to receive message from CCB server 10.178.6.5

I ran condor_reconfig on the processing nodes.  Do I need to restart condor on all the nodes as a result of the change?  The error message makes me think not.

Any pointers to debug this would be appreciated.

Thanks for the help.

Don
FSU HPC


