
[Condor-users] faster condor_submits with dagman



Hello,

I'm running a large number of short-running jobs (maybe 2 minutes each?) on a large Condor pool, with a large Condor DAG managing them. I know, I know, this isn't ideal and isn't what Condor was designed for, and I should figure out a way to make the jobs longer running, but I want to work on this a little more.

The jobs finish as fast as dagman can submit new ones into the queue, so eventually I go from 1000 idle and 2000 running jobs to 10 idle and 2000 running, and I can't keep the queue full of pending jobs. I've moved the schedd's spool onto a RAMdisk to try to improve throughput, and that helped somewhat but not enough. Any other suggestions for tuning the system for a higher rate of job throughput, before I give up and take a different approach?
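
(The RAMdisk part was just pointing the schedd's SPOOL at a tmpfs mount, roughly like the snippet below; the mount point is an example rather than my exact path:)

# condor_config.local on the schedd host; path is just an example
SPOOL = /mnt/ramdisk/condor/spool
# restart the schedd after moving the existing spool contents over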

Here are some of the variables I've been playing with, though with limited success. The machine (schedd and collector/negotiator on the same host) is a 2.4 GHz 4-core AMD system with 8 GB of RAM.


SCHEDD_INTERVAL = 30
DAGMAN_MAX_JOBS_IDLE = 1000
DAGMAN_SUBMIT_DELAY = 0
DAGMAN_MAX_SUBMITS_PER_INTERVAL = 1000
DAGMAN_USER_LOG_SCAN_INTERVAL = 1
SCHEDD_INTERVAL_TIMESLICE = 0.10
SUBMIT_SKIP_FILECHECKS = True
HISTORY =
NEGOTIATOR_INTERVAL = 30
NEGOTIATOR_MAX_TIME_PER_SUBMITTER = 20
NEGOTIATOR_MAX_TIME_PER_PIESPIN = 20
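
In case the shape of the workload matters: the DAG is essentially a lot of ready-to-run nodes sharing one short submit description, roughly like the sketch below (file names and paths are placeholders, not my real files):

# jobs.dag (placeholder names)
JOB  work0001 work.sub
VARS work0001 item="0001"
JOB  work0002 work.sub
VARS work0002 item="0002"
# ...and so on for the remaining nodes

# work.sub (the short-running job each node submits)
universe   = vanilla
executable = work.sh
arguments  = $(item)
output     = out/work.$(item).out
error      = out/work.$(item).err
log        = work.log
queue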


Thanks,
Peter