
Re: [Condor-users] server hardware for a big pool



> Hello all,
> we are currently running all condor daemons on a P3 1 GHz with 2 GB of
> memory, with ~1700 clients. As this machine is totally at the edge, we
> were wondering what hardware other people with pools of thousands of
> machines (ours will be 4000+ machines at the end) are using, and also
> how they are organised.

Do you mean you're running the schedd, negotiator and collector daemons
all on this one machine? Or do you have schedds scattered throughout
your grid and this is just your collector/negotiator machine?

If it's a centralized schedd system, then the schedd daemon should be
alone on its own machine, preferably one with multiple
processors/cores, because it needs a lot of CPU to do scheduling and it
spawns a new process (called a condor_shadow) for every job that runs in
your system. So if this is your only schedd and you've got 4000 VMs,
that's 4000 condor_shadow processes.
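
As a rough back-of-the-envelope sketch of that arithmetic: one shadow per
running job, multiplied by whatever a shadow costs in memory on your
submit machine. The function name and the 1.0 MB per-shadow figure below
are placeholder assumptions for illustration, not measured Condor
numbers, so plug in what you actually observe.

```python
def shadow_footprint(running_jobs, mb_per_shadow):
    """Estimate shadow count and memory on a single schedd machine.

    Assumes one condor_shadow process per running job, as described
    above. mb_per_shadow is a value you must measure yourself.
    """
    shadows = running_jobs
    total_mb = shadows * mb_per_shadow
    return shadows, total_mb


# Hypothetical input: 4000 busy VMs, 1.0 MB per shadow (placeholder).
shadows, total_mb = shadow_footprint(running_jobs=4000, mb_per_shadow=1.0)
print(f"{shadows} shadows, roughly {total_mb:.0f} MB for shadows alone")
```

This ignores the schedd's own memory and CPU use, so treat the result as
a lower bound.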
 
> Also I would like to know if we should create one big pool, or maybe
> some smaller pools (what size) and flock them? I would tend to go for
> one big pool, as this is easier for the administration (all the same
> config file).

It really depends on what you're trying to achieve. Do you have multiple
schedds in your one big pool approach or just one? You'll probably need
multiple schedds to scale up to 4k VMs. One collector/negotiator machine
can handle a pool this size, but one schedd will probably crumble under
the load.
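
To make the "multiple schedds" point concrete, here is a minimal sketch
of how you might size it. The per-schedd cap of 500 running jobs is a
made-up number for illustration (I'm not claiming that is where a schedd
falls over), and both function names are hypothetical. Find the real
limit by watching load on your own submit machines.

```python
import math


def schedds_needed(total_slots, max_running_per_schedd):
    """Smallest number of schedds that keeps each one under its cap."""
    return math.ceil(total_slots / max_running_per_schedd)


def spread_evenly(total_slots, n_schedds):
    """Split running jobs across schedds as evenly as possible."""
    base, extra = divmod(total_slots, n_schedds)
    return [base + 1 if i < extra else base for i in range(n_schedds)]


# Hypothetical inputs: 4000 VMs, assumed cap of 500 running jobs/schedd.
n = schedds_needed(4000, 500)
print(n, spread_evenly(4000, n))
```

In practice jobs won't spread this evenly, since it depends on who
submits where, so leave yourself some headroom.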

With some more information I can probably give you less general answers.

- Ian