[Condor-users] private networks, submit nodes and flocking



Hi Folks,

This is a restatement of an earlier question, and one I've seen asked before without an adequate answer.

We have a pool on an internal network and a workstation pool on an external network, and would like jobs on the internal pool to be able to flock to the external pool. (Linux/UNIX machines for now.)

The submit nodes on the internal pool all have both internal and external interfaces, and the head node of each pool has both internal and external interfaces as well. As a result, negotiation cycles complete successfully, but jobs never start on the external compute nodes.

As I understand it now, the Condor daemons bind to specific network interfaces, the schedd in particular. The schedd then tries to reach an IP on the external network via the internal interface, which causes hangs when contacting the schedd via condor_q or during negotiation cycles.
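
To make clear what I mean by "binding" here, a minimal Python sketch (not Condor code, just my understanding of the socket behaviour). A socket that is explicitly bound to a local address before connecting uses that address as its source, rather than letting the kernel pick one from the routing table. The function names and the use of the loopback address are assumptions for illustration only.

```python
import socket


def source_address_when_bound(local_ip, server_addr):
    """Connect to server_addr from a socket explicitly bound to local_ip.

    Returns the source IP the connection actually uses. Because of the
    bind() call, this is local_ip, not whatever the routing table would
    otherwise have selected.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((local_ip, 0))      # pin the source address; port 0 = any free port
        s.settimeout(5)
        s.connect(server_addr)
        return s.getsockname()[0]


def demo():
    """Run the sketch against a throwaway listener on loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    try:
        return source_address_when_bound("127.0.0.1", listener.getsockname())
    finally:
        listener.close()


if __name__ == "__main__":
    print(demo())
```

If the bound address sits on a network that has no route to the destination, the connect call would be expected to hang or time out, which looks like what we are seeing.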

Questions:

Is this assessment of the problem correct?

Why do the Condor daemons bind to specific network interfaces and ignore the routing table?

Is there a workaround for this problem, for the specific case where every submit node can see both networks but its daemons are presently bound to a specific network interface (the internal one in this case)?


Thanks for the input. This seems a not-uncommon case, so a general solution would probably benefit a good number of users.



rob