Turn off machine-activity-induced preemption and suspension in your Condor configuration.
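
A minimal sketch of such a policy, assuming it goes into the local configuration file of each execute node and that no other policy expressions there override it. The macro names are the standard startd policy expressions; whether you want every one of them pinned like this depends on your setup, so treat it as a starting point rather than a tested recipe:

```
# Sketch only: make the execute node ignore keyboard and CPU activity.
# Always willing to start a job.
START = TRUE
# Never suspend a running job (CONTINUE only matters if a job is suspended).
SUSPEND = FALSE
CONTINUE = TRUE
# Never preempt or kill a running job because of machine activity.
PREEMPT = FALSE
KILL = FALSE
```

After editing the file, running condor_reconfig on the node should make the daemons pick up the change.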


If you are doing any kind of large-scale test (e.g. several thousand or several hundred thousand jobs in the queue), then the latest development version (6.9.5) is the best Condor version to use, unless you want Condor to perform poorly ;-) Although this is a development version, it is the feature-frozen version that will become the beginning of the next stable series near the end of this year, so it is a reasonable choice for comparison.

If you are running more than 200 jobs at a time from a single Condor schedd, then you will need to set MAX_JOBS_RUNNING higher than the default of 200.
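
For example, on the submit machine (the value 2000 is only an illustration, not a recommendation; pick something that matches the size of your test and what the machine can handle):

```
# Sketch only: raise the per-schedd limit on simultaneously running jobs.
# 2000 is a made-up example value.
MAX_JOBS_RUNNING = 2000
```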

If you do use a version of Condor from before 6.9.5 (or a configuration file from before 6.9.5), then you will not see job startup rates greater than 1 job per 2 seconds, because the default configuration in prior versions of Condor was JOB_START_DELAY = 2. In 6.9.5, the throttling of the job startup rate has been moved to the file transfer stage (MAX_CONCURRENT_UPLOADS and MAX_CONCURRENT_DOWNLOADS), and I would expect the defaults to be sufficient for most purposes.
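
If you are stuck with an older version or an older configuration file, a sketch of the override on the submit machine would look like the following. This assumes you really do want no delay between job starts, which can put a heavy load on the schedd during a large test:

```
# Sketch only: remove the 2-second pause between job starts that
# pre-6.9.5 configuration files set by default.
JOB_START_DELAY = 0
```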


Brad Goldsmith wrote:

Hi All,

I have a small cluster of nodes that I am using to do some comparative testing between different distributed computing systems. Condor is one of these that I am starting to do some testing with.

What I'd like to do, to make things as fair as possible, is to force all of the systems to compute pretty much as quickly as they can. With Condor, I have noticed that jobs seem to sit idle for some time before being computed. I am guessing this is because I've been constantly interrogating clients to see what's going on and this has tripped its suspension policies. When I go for lunch and come back the work is done :-)

What is the best way to make a condor client grab whatever is available and run at it at full steam?



