
Re: [HTCondor-users] HTCondor DAGman issue



Hi Zhuo,

I think Todd's suggestion is probably what will solve the 200-job limit. However, there are a few other throttles you should look at:

DAGMAN_MAX_SUBMITS_PER_INTERVAL defines how many jobs get submitted during each submit cycle. The default is 5, which is pretty low. You could try increasing it.

MAX_RUNNING_SCHEDULER_JOBS_PER_OWNER might also be a factor. Are your cron jobs all running under the same user? I don't think this is the problem here, but keep it in mind if the previous suggestions don't work.
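For reference, a condor_config fragment raising both throttles might look like the sketch below. The specific values are illustrative, not recommendations; pick numbers your schedd host can handle, and run condor_reconfig on the submit host afterward so they take effect.

```
# Submit up to 100 jobs per DAGMan submit cycle (default is 5)
DAGMAN_MAX_SUBMITS_PER_INTERVAL = 100

# Allow more concurrently running scheduler-universe jobs per user
# (illustrative value)
MAX_RUNNING_SCHEDULER_JOBS_PER_OWNER = 400
```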

Mark



On Tue, Sep 11, 2018 at 3:34 PM Todd L Miller <tlmiller@xxxxxxxxxxx> wrote:
> My questions are if there are any other configurations that limit the number
> of DAGman jobs running, and what could cause only 200 DAGman jobs running
> when there are machines unclaimed.

    START_SCHEDULER_UNIVERSE by default limits any individual schedd
to 200 schedd jobs, of which DAGMan jobs are by far the most common type.
If your schedd host can handle the load, feel free to jack that number up.
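For example, a sketch of raising that limit in condor_config (the 500 is an illustrative value; START_SCHEDULER_UNIVERSE is a boolean expression that must evaluate to TRUE for the schedd to start another scheduler-universe job):

```
# Allow up to 500 concurrently running scheduler-universe
# (e.g. DAGMan) jobs on this schedd
START_SCHEDULER_UNIVERSE = TotalSchedulerJobsRunning < 500
```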

- ToddM
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


--
Mark Coatsworth
Systems Programmer
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin-Madison
+1 608 206 4703