
Re: [Condor-users] how to beef up the submit host ?



On Fri, 9 Jun 2006, Dr Ian C. Smith wrote:

<snip>
> Question is: if we had the money to beef up the submit host
> would it be worth going for a multi-processor (can schedd
> work with multiple threads) ? Or would more memory be the
> answer (memory usage seems OK at the moment though).

The schedd is single-threaded, so a multi-processor machine won't really help, except insofar as the schedd may get a whole processor to itself while all the other processes run on the other processor(s).

More memory won't help you, either, in our experience.

> Anyone have a handle on what hardware spec would cope with this
> kind of thing - I know there are some big Condor installations
> out there.

Basically, your problem is not hardware: you are running up against the limitations of the current schedd architecture. You may be able to tune your OS and Condor configuration to get better performance out of the schedd at this sort of load, but fundamentally you are hitting a limit of the current design.

We have been around this loop many, many times here, with many different configurations. Probably our biggest request of the Condor Team is to re-design the schedd. :)

I think that for large job queues you basically need multiple schedds - either multiple submission points, or multiple schedds on the same machine (in which case a multi-processor machine would help).
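
For the second case, the configuration looks something like the sketch below. I'm going from memory of what newer Condor versions support (the SCHEDD2 name, the -local-name argument and the per-daemon SCHEDD_NAME/SPOOL knobs may well differ in your version, and each schedd must get its own spool directory), so check your version's manual rather than pasting this in verbatim:

    # Sketch: run a second schedd under the same condor_master.
    SCHEDD2       = $(SCHEDD)
    SCHEDD2_ARGS  = -f -local-name schedd2
    # Each schedd needs a distinct name and its own spool directory.
    SCHEDD.SCHEDD2.SCHEDD_NAME = schedd2
    SCHEDD.SCHEDD2.SPOOL       = $(SPOOL).schedd2
    DAEMON_LIST   = $(DAEMON_LIST) SCHEDD2

Users then pick a queue with something like "condor_submit -name schedd2", or you point different groups of users at different schedds yourself.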

Alternatively, you can find ways of limiting the number of jobs in the queue (we wrap the condor_submit script and get aggressive with anyone who fools around to bypass that) - unfortunately Condor doesn't allow you to limit the number of jobs the schedd has in its queue (at least it didn't - maybe this has been added in the very latest 6.7 release?). <sigh>
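
A minimal sketch of the sort of wrapper we mean - the path to the real binary and the job cap are made-up values, adjust to taste:

    #!/bin/sh
    # Hypothetical wrapper installed in place of condor_submit.
    REAL_SUBMIT=/opt/condor/bin/condor_submit.real
    MAX_JOBS=5000

    # Count every job currently in the local schedd's queue; -format
    # prints one line per job and suppresses condor_q's headers.
    NJOBS=`condor_q -format "%d\n" ClusterId 2>/dev/null | wc -l`

    if [ "$NJOBS" -ge "$MAX_JOBS" ]; then
        echo "Queue already holds $NJOBS jobs (limit $MAX_JOBS); try again later." >&2
        exit 1
    fi

    exec "$REAL_SUBMIT" "$@"

It's crude (it caps the whole queue rather than per-user, and there's a race between the check and the submit), but it keeps the schedd from drowning.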

There are no easy answers for this one, I'm afraid.

	-- Bruce

--
Bruce Beckles,
e-Science Specialist,
University of Cambridge Computing Service.