
Re: [HTCondor-users] How use less ports




On 6/11/13 3:11 AM, Romain wrote:
Dan Bradley <dan@...> writes:


Inbound access must be allowed to the shared port.  Outbound access must
be allowed to all ephemeral ports or all ports in a range that you
define by configuring OUT_LOWPORT and OUT_HIGHPORT.  Some guidelines on
how many ports are required may be found here:


http://research.cs.wisc.edu/htcondor/manual/v7.8/3_7Networking_includes.html#SECTION00471500000000000000
--Dan

OK, so I can't use 9618 and 9614 as the only open ports (in and out)? With:
9618 for the collector and for shared_port
9614 for the negotiator

Do I have to open inbound access to 9618, 9614 and a shared_port (9616, for
example) and open outbound access for my whole port range (for example 9600 to
9800) on all nodes, is that right?

That sounds correct. I can't think of any good reason to have the negotiator use its own fixed port. It should be easiest to just have it use the shared port, which should be the default behavior if you enable shared_port. The collector can also be configured to use shared_port if you define COLLECTOR_HOST appropriately throughout the pool.
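
As a rough illustration of the above, a minimal configuration sketch might look like the following. The host name cm.example.org is hypothetical, 9616 is only the example port from the question, and the exact knobs should be checked against the manual for your HTCondor version.

```
# Sketch only: run the shared port daemon and let the other daemons use it
USE_SHARED_PORT = True
DAEMON_LIST = $(DAEMON_LIST) SHARED_PORT
# Port on which condor_shared_port listens (9616 is the example from this thread)
SHARED_PORT_ARGS = -p 9616

# Assumption: to put the collector behind the shared port as well, the shared
# port listens on the collector's port and COLLECTOR_HOST names the collector's
# socket, set identically on every machine in the pool:
# SHARED_PORT_ARGS = -p 9618
# COLLECTOR_HOST = cm.example.org:9618?sock=collector
```

With a setup along these lines the negotiator needs no fixed port of its own.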


Is the port range determined by the maximum number of jobs that can run?

Yes. For outgoing connections from the submit machine, you need a port range (OUT_LOWPORT to OUT_HIGHPORT) that scales with the maximum number of running jobs. Other nodes can have a smaller port range. If the large outbound port range is a problem, you can configure the execute nodes to use CCB. This causes all connections between the submit and execute nodes to be in the direction execute --> submit, which reduces the number of required outbound ports on the submit machine.

I'd recommend not complicating things unless you really have to.
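
For reference, the outbound range and the CCB option mentioned above could be expressed roughly as follows. The 9600 to 9800 range is the example from the question, not a recommendation, and should be sized from the guidelines in the manual.

```
# Submit machine: restrict outgoing connections to a fixed range
OUT_LOWPORT = 9600
OUT_HIGHPORT = 9800

# Execute nodes only, and only if the large outbound range is a problem:
# use the collector as the CCB server so connections go execute --> submit
CCB_ADDRESS = $(COLLECTOR_HOST)
```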



To summarize what I can do:

All nodes can join the pool; when I run condor_status, all nodes appear.
When I submit jobs from the network that has only execute and submit machines
(network 1), I can see them on the master with condor_q -global.
When I submit jobs from the network that has the execute/submit machines and
the master (network 2), I can't see the jobs with condor_q -global from the
master. I get this:

-- Failed to fetch ads from: <X.X.X.228:9618?sock=15911_c3e8_3> :
THE_SUBMIT_MACHINE (X.X.X.228)
CEDAR:6001:Failed to connect to <X.X.X.228:9618?sock=15911_c3e8_3>

Jobs submitted from network 1 run on network 1 machines; jobs from network 2
apparently never start running...

It is not clear to me what configuration you are using when you observe the above behavior.

--Dan