
[Condor-users] Updated version of "Linux Scalability" Condor page



Is there an updated version of this page:

http://www.cs.wisc.edu/condor/condorg/linux_scalability.html

?

In particular, I'm trying to create a 100k-node DAG (flat, no dependencies) with MAXJOBS 6000, and I'm getting the error:

**** PANIC -- OUT OF FILE DESCRIPTORS at line 796 in dprintf.c

root # cat /proc/sys/fs/file-max
781235

ijstokes $ ulimit -n
1024

The nodes are described by 100k separate classads in 100k directories (in a 2-tier hierarchy groupX/nodeY, so as to avoid overloading a single directory), with one log file in each node directory (100k log files in total).
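
For readers who want to reproduce a layout like this, here is a minimal sketch of a generator for a flat DAG with a two-tier groupX/nodeY directory tree. It is not the script used for the DAG above. The function name, the node naming scheme (`g0n0`), the submit file name `node.sub`, the output name `flat.dag`, and the group size of 1000 are all assumptions made for illustration. The sketch only creates the directories and the DAG file; the per-node submit files would still have to be written separately.

```python
from pathlib import Path


def write_flat_dag(root, n_nodes, nodes_per_group=1000, submit_name="node.sub"):
    """Create groupX/nodeY directories under root and write a flat DAG file.

    All names used here are hypothetical. No PARENT/CHILD lines are written,
    so the resulting DAG has no dependencies between nodes.
    """
    root = Path(root)
    lines = []
    for i in range(n_nodes):
        # Spread nodes over groups so that no single directory gets too large.
        group, node = divmod(i, nodes_per_group)
        rel_dir = Path(f"group{group}") / f"node{node}"
        (root / rel_dir).mkdir(parents=True, exist_ok=True)
        # The DIR keyword names the working directory for this node.
        lines.append(f"JOB g{group}n{node} {submit_name} DIR {rel_dir.as_posix()}")
    dag_path = root / "flat.dag"
    dag_path.write_text("\n".join(lines) + "\n")
    return dag_path
```

A DAG file written this way could then be submitted with something like `condor_submit_dag -maxjobs 6000 flat.dag`, assuming that the MAXJOBS setting mentioned above corresponds to the `-maxjobs` option.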

It takes about 1 hour for the DAG to be submitted. I've bumped up the ulimits to a level which should get rid of the problem, but it isn't clear whether I need to re-submit the DAG, restart Condor, log out and back in, or even reboot the machine for these changes to take effect. Any advice is kindly appreciated.
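
Resource limits are inherited by a process when it starts, so one way to see whether a given running process has picked up the new values is to read its `/proc/<pid>/limits` file. The following is a minimal sketch, assuming a Linux kernel that provides that file. The function names are hypothetical, and the PID of the process of interest (for example the DAGMan process) has to be found separately, e.g. with `ps`.

```python
import resource


def parse_open_files_limit(limits_text):
    """Return (soft, hard) from the 'Max open files' line of /proc/<pid>/limits.

    Values are returned as strings because the kernel may report 'unlimited'.
    Returns None if the line is not present.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Line layout: Max open files <soft> <hard> files
            fields = line.split()
            return fields[3], fields[4]
    return None


def open_files_limit_of(pid):
    """Read the open-files limits of an already running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_open_files_limit(f.read())


def own_open_files_limit():
    """Return (soft, hard) for the current process, like `ulimit -n` and `ulimit -H -n`."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

If `open_files_limit_of(pid)` still reports a soft limit of 1024 for a running process, that process was started before the change and has not picked up the new value.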

Regards,

Ian

$ ulimit -H -a
file size               (blocks, -f) unlimited
pending signals                 (-i) 69632
open files                      (-n) 40000
max user processes              (-u) 20000

--
Ian Stokes-Rees, Research Associate
SBGrid, Harvard Medical School
http://sbgrid.org