We are running 7.8.4.

The text below is from a colleague, but basically, when our main submit
node is very busy (2000-3000 jobs), we see a problem: running condor_q
causes condor_schedd to fork, and because the schedd is fairly massive by
then, this can cause us to run out of memory.

We are buying more memory (cheap), but has anything in this area changed
in 7.8.?

Any thoughts?
Many thanks

The additional condor_schedd processes have nothing to do with a single
scheduler being overloaded. They are created automatically and instantly
whenever a condor_q command is run, and they appear to be copies of the
running scheduler (i.e. they immediately claim/use the same amount of
memory).


I ran

ps axu | awk '{mem+=$6} END {print mem}'

on the submitter to get an idea of how much memory is required by the
[2200] running processes. The figure returned is around 12GB; you will
recall that our submitter only has 8GB of memory.
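
As a rough sketch of the same measurement with the units made explicit: on
Linux, column 6 of "ps aux" is RSS in kilobytes, so dividing the sum by
1048576 gives GiB. The function name sum_rss_gib is made up for
illustration. Bear in mind that RSS counts pages shared between processes
once per process, and a forked child initially shares its parent's pages
copy-on-write, so a total like this may overstate the physical memory
actually in use.

```shell
# Hypothetical helper: sum the RSS column (field 6 of "ps aux" output,
# in kilobytes on Linux) read from stdin and print the total in GiB.
sum_rss_gib() {
    # NR > 1 skips the header line printed by ps.
    awk 'NR > 1 { kb += $6 } END { printf "%.2f\n", kb / 1048576 }'
}

# Usage: ps aux | sum_rss_gib
```

Reading from stdin means the same function can be pointed at a saved ps
listing as well as at the live output.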

Hence, simply to support additional processes and more Condor nodes, the
submitter needs at least 16GB, although I would suggest that, if the rack
supports it, a minimum of 24GB is probably better.

