
Re: [HTCondor-users] Preventing users from submitting too many jobs



What motivates this requirement to limit the number of jobs, is it the
resources consumed by the queued jobs?

The main motivation is to avoid stability problems that we experienced when a user accidentally submitted 130,000+ jobs. Basically, we ended up with about 150,000 jobs in the queue, and that resulted in a number of problems that amplified each other to create a significant headache:

- condor_q timeouts caused monitoring and maintenance scripts to malfunction and users to complain
- "condor_rm USER" wouldn't work, probably because USER had too many jobs, so we had to script around that
- once we managed to condor_rm the jobs, we ran into problems with condor_schedd getting stuck in I/O wait (I am guessing because all those jobs had logs on slow network storage, but not sure)




Vlad



On 05/27/15 09:05, Miron Livny <miron@xxxxxxxxxxx> wrote:
Vlad,

What motivates this requirement to limit the number of jobs, is it the
resources consumed by the queued jobs?

Miron



On 5/27/2015 8:30 AM, Vladimir Brik wrote:
Hello,

I would like to have a mechanism to prevent a user from submitting new
jobs if he or she already has a large number of jobs in the queue (e.g.
more than 20,000).

SUBMIT_MAX_PROCS_IN_CLUSTER and MAX_JOBS_SUBMITTED help, but don't solve
the problem completely because they are global limits, and I'd like to
have per-user limits.

What I am thinking of doing is replacing condor_submit with a wrapper
that runs condor_q to check if the user submitted too many jobs, and if
so, drops a config file with appropriate SUBMIT_REQUIREMENT_* to block
the user from submitting more. The wrapper would be clever enough to
skip condor_q if it's called too frequently.
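
The check at the heart of such a wrapper could look roughly like the sketch below. It is only an illustration under several assumptions: the function names, the 60-second interval, and the in-memory cache are hypothetical, and counting jobs with "condor_q USER -af ClusterId" (one output line per job) is one possible way to get the number. The sketch only decides whether the user is over the limit; what to do with the answer (for example writing a SUBMIT_REQUIREMENT_* config file, or refusing to run the real condor_submit) is left out.

```python
import subprocess
import time

MAX_JOBS_PER_USER = 20000   # threshold taken from the example above
MIN_CHECK_INTERVAL = 60     # seconds between condor_q calls (assumed value)


def count_jobs_condor_q(user):
    """Count a user's queued jobs by printing one ClusterId per job."""
    out = subprocess.run(
        ["condor_q", user, "-af", "ClusterId"],
        capture_output=True, text=True, check=True,
    ).stdout
    return len(out.split())


def over_limit(user, cache, now=None, count_jobs=count_jobs_condor_q,
               limit=MAX_JOBS_PER_USER, min_interval=MIN_CHECK_INTERVAL):
    """Return True if the user already has more than `limit` jobs queued.

    `cache` maps user -> (timestamp, job_count). A cached count is reused
    while it is younger than `min_interval`, so condor_q is not called on
    every submission.
    """
    now = time.time() if now is None else now
    stamp, count = cache.get(user, (None, 0))
    if stamp is None or now - stamp >= min_interval:
        count = count_jobs(user)          # hypothetical hook; defaults to condor_q
        cache[user] = (now, count)
    return count > limit
```

A real wrapper would need to keep the cache somewhere that survives between invocations (a small per-user file, for instance), since each condor_submit call is a separate process.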

Does anybody know of a simpler way to have per-user limits on the number
of jobs in queue?

Is there a way to do something similar for jobs submitted using python
bindings and/or remote submissions?


Thanks,

Vlad
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
