
Re: [HTCondor-users] Understand the working of condor_shared_port



Thanks for your reply Todd.

If I understand correctly, the maximum number of running jobs from a single schedd is limited by the ip_local_port_range setting of the Linux machine.

The paragraph below from the documentation [1] mentions reducing the number of ephemeral ports required on the submit node. How exactly does the condor_shared_port daemon help with this reduction?

A second benefit of the condor_shared_port daemon is that it helps address the scalability issues of a submit machine. Without the condor_shared_port daemon, more than 2 ephemeral ports per running job are often required, depending on the rate of job completion. There are only 64K ports in total, and most standard Unix installations only allocate a subset of these as ephemeral ports. Therefore, with long running jobs, and with between 11K and 14K simultaneously running jobs, port exhaustion has been observed in typical Linux installations. After increasing the ephemeral port range to its maximum, port exhaustion occurred between 20K and 25K running jobs. Using the condor_shared_port daemon dramatically reduces the required number of ephemeral ports on the submit node where the submit node connects directly to the execute node. If the submit node connects via CCB to the execute node, no ports are required per running job; only the one port allocated to the condor_shared_port daemon is used.
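To make the numbers in that paragraph concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the common Linux default ip_local_port_range of 32768 to 60999 and a flat number of ephemeral ports per running job; the function names and the ports_per_job parameter are my own for illustration, not anything from HTCondor.

```python
# Rough estimate of how many running jobs fit in the ephemeral port range.
# Assumption: every running job consumes a fixed number of ephemeral ports
# on the submit node (the documentation says "more than 2 ... often").

def parse_port_range(text):
    """Parse the two integers found in ip_local_port_range, e.g. '32768\t60999'."""
    low, high = (int(field) for field in text.split())
    return low, high


def read_local_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the ephemeral port range of the local Linux machine."""
    with open(path) as handle:
        return parse_port_range(handle.read())


def max_jobs(low, high, ports_per_job=2):
    """Upper bound on running jobs before the ephemeral range is exhausted."""
    available = high - low + 1  # the range is inclusive on both ends
    return available // ports_per_job


if __name__ == "__main__":
    # Common Linux default range (assumed, check your own machine).
    print(max_jobs(32768, 60999, ports_per_job=2))
    # Range widened to nearly its maximum, at 2 and 3 ports per job.
    print(max_jobs(1024, 65535, ports_per_job=2))
    print(max_jobs(1024, 65535, ports_per_job=3))
```

With the default range and 2 ports per job the bound is about 14K jobs, which lines up with the 11K to 14K figure quoted above. With the widened range, the observed 20K to 25K would correspond to somewhere between 2 and 3 ports per job under this simple model.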


I removed shared port usage from the box and restarted the condor service to make sure the change is in place. condor_shadow still opens one ephemeral port for each job. Having condor_shared_port enabled or disabled makes no difference that I can see.

# condor_config_val USE_SHARED_PORT
False


[1] https://htcondor.readthedocs.io/en/latest/admin-manual/networking.html

Thanks & Regards,
Vikrant Aggarwal


On Sat, Jan 8, 2022 at 2:22 AM Todd L Miller <tlmiller@xxxxxxxxxxx> wrote:
> After reading documentation [1] it's clear that we are using the
> condor_shared_port daemon to optimize TCP connections established by condor
> jobs.

    All that the shared port daemon does is reduce the number of
ports which must be open in the firewall. The number of TCP connections
is unchanged.

> I started a batch of 20 jobs but I still see condor_shadow processes
> establishing TCP connection with remote nodes.

    This is as expected. The only thing shared port does is make sure
that all of the destination ports are the same. To be clear: the shared
port daemon does NOT multiplex TCP streams. Each connection still
requires an ephemeral port.
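
The point about destination ports versus ephemeral ports can be checked with plain sockets, independent of HTCondor. The following is a minimal sketch on the loopback interface: one listener stands in for a single shared destination port, and each client connection still gets its own ephemeral source port. The function name open_connections is made up for this illustration.

```python
import socket


def open_connections(n):
    """Open n TCP connections to one local listener and report the ports used.

    Returns (listen_port, source_ports, dest_ports), where source_ports is the
    list of client-side (ephemeral) ports and dest_ports is the set of ports
    the clients are connected to.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(n)
    listen_port = server.getsockname()[1]

    clients, accepted = [], []
    try:
        for _ in range(n):
            clients.append(socket.create_connection(("127.0.0.1", listen_port)))
            conn, _addr = server.accept()
            accepted.append(conn)
        source_ports = [c.getsockname()[1] for c in clients]
        dest_ports = {c.getpeername()[1] for c in clients}
        return listen_port, source_ports, dest_ports
    finally:
        for sock in clients + accepted + [server]:
            sock.close()


if __name__ == "__main__":
    listen_port, source_ports, dest_ports = open_connections(5)
    print("destination port(s):", dest_ports)   # a single shared port
    print("source ports:", source_ports)        # one distinct port per connection
```

Every connection shares the same destination port, but the source ports are all different, because the operating system has to keep each (source address, source port, destination address, destination port) combination unique.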

- ToddM