
Re: [HTCondor-users] Collector daemon crashing on Windows due to file descriptor limit



The 1024 comes from the compatibility layer that we use to make the Windows and Linux code bases the same, so we can't make it a knob, unfortunately. We would have to change the Windows code base to use a more Windows-style mechanism for detecting hot sockets in order to go beyond this limit.
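
For readers unfamiliar with the term, "detecting hot sockets" means asking the operating system which of many open sockets currently have data waiting. The sketch below illustrates the idea with a select()-style call. It is not HTCondor code, and the thread does not say which call the compatibility layer actually uses; the helper name `hot_sockets` is hypothetical. In C, select()-style interfaces usually work on a descriptor set whose capacity is fixed at compile time, which is one common way a hard ceiling such as 1024 arises.

```python
import select
import socket


def hot_sockets(socks, timeout=0.0):
    """Return the sockets in `socks` that have data ready to read.

    Hypothetical helper for illustration only; not HTCondor's implementation.
    """
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable


if __name__ == "__main__":
    # Three connected socket pairs stand in for daemons talking to a collector.
    pairs = [socket.socketpair() for _ in range(3)]
    readers = [reader for reader, _ in pairs]

    # Only the second "daemon" sends an update.
    pairs[1][1].send(b"update")

    ready = hot_sockets(readers, timeout=1.0)
    print([readers.index(s) for s in ready])

    for reader, writer in pairs:
        reader.close()
        writer.close()
```

Every socket being watched has to fit in the set passed to the call, so the number of simultaneous connections is bounded by the capacity of that set, however many descriptors the operating system itself would allow.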

 

-tj

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Peet Whittaker
Sent: Wednesday, June 1, 2022 10:16 AM
To: Greg Thain <gthain@xxxxxxxxxxx>
Cc: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Collector daemon crashing on Windows due to file descriptor limit

 

1014 comes from a hard coded limit of 1024 file descriptors for HTCondor on Windows (with a cushion), and there's no easy way to increase that, I fear.

 

Ah, fair enough. I wonder whether this could be made into a config value in the future, as I think Windows can actually support a greater number of file descriptors (up to ~8k); see: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setmaxstdio?view=msvc-160

 

But yes, we will look to migrate our CM to Linux (well, just the collector/negotiator parts anyway 😉).

 

As an aside, is there any documentation/literature on the maximum number of nodes/slots a Windows and/or Linux CM can support? I did find the following page, but it mainly talks about RAM requirements (there is nothing related to file descriptors, for example) and is obviously quite old now (2007):

 

https://research.cs.wisc.edu/htcondor/CondorWeek2007/large_condor_pools.html

 

Kind regards,

 

Peet Whittaker

Discipline Lead for DevOps | Principal Software Developer

 

From: Greg Thain <gthain@xxxxxxxxxxx>
Sent: 31 May 2022 22:37
To: Peet Whittaker <Peet.Whittaker@xxxxxxxxxxxxxxxxx>
Subject: Re: [HTCondor-users] Collector daemon crashing on Windows due to file descriptor limit

 

On 5/31/22 16:23, Peet Whittaker wrote:

Hi Greg,

 

Thanks for the quick reply!

 

Ideally, yes (and indeed will be adding Linux execute nodes in the future). However, it would be quite a big task to shift the CM to Linux (there are other processes running on the CM too).

 

Note that the CM in our terminology is just the collector and negotiator processes, not the schedd. I assume that if you split the collector & negotiator onto a Linux machine, the other processes on your CM would stay on the Windows schedd/submit machine?

1014 comes from a hard coded limit of 1024 file descriptors for HTCondor on Windows (with a cushion), and there's no easy way to increase that, I fear.
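
The arithmetic behind those two numbers can be written out as a short sketch. The values 1024 and 1014 come from this thread; the cushion of 10 is simply their difference, and the helper `remaining_connections` is hypothetical. How HTCondor actually applies the cushion internally is not described here.

```python
HARD_FD_LIMIT = 1024    # hard-coded limit quoted in this thread
EFFECTIVE_LIMIT = 1014  # ceiling observed on the Windows collector

# Inferred by subtraction; the thread only says "with a cushion".
CUSHION = HARD_FD_LIMIT - EFFECTIVE_LIMIT


def remaining_connections(open_sockets: int) -> int:
    """Hypothetical helper: connections left before the effective limit."""
    return max(0, EFFECTIVE_LIMIT - open_sockets)


if __name__ == "__main__":
    print(CUSHION)
    print(remaining_connections(1000))
```

A pool whose daemons hold more than about a thousand simultaneous connections to a Windows collector would therefore run out of headroom, which matches the crash discussed in this thread.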

-greg

 

 

 
