[Condor-users] Defusing a Condor bomb?


I have a cluster of about 80 nodes, with file access for users spread over a handful of servers. I monitor the load on each server and don't let a user's job start if that user's server is too heavily loaded (everyone uses a central submission script, so the requirements get added to the .cmd file). The problem is that if a user submits a bunch of jobs to an almost empty queue, they could all start running (and bring the server to its knees) before the load monitor notices. Is there some way to throttle the rate at which jobs from the same user start executing, to prevent this from happening?
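
To make the request concrete, here is a rough sketch of the behaviour I have in mind, written as plain Python rather than as Condor configuration. Everything in it is hypothetical: the function name stagger_start_times, the 60-second interval, and the (user, job_id) tuples are made up for illustration, and it says nothing about how Condor itself would enforce the spacing.

```python
from collections import defaultdict


def stagger_start_times(jobs, submit_time, min_interval=60):
    """Assign each job an earliest start time (in seconds) so that jobs
    from the same user begin at least min_interval seconds apart.

    jobs is a list of (user, job_id) pairs in submission order.
    All names and the default interval are illustrative assumptions.
    """
    # Earliest free start slot per user; every user begins at submit_time.
    next_slot = defaultdict(lambda: submit_time)
    schedule = []
    for user, job_id in jobs:
        start = next_slot[user]
        schedule.append((user, job_id, start))
        # Push this user's next job back by one interval.
        next_slot[user] = start + min_interval
    return schedule


if __name__ == "__main__":
    batch = [("alice", "a1"), ("alice", "a2"), ("bob", "b1"), ("alice", "a3")]
    for user, job_id, start in stagger_start_times(batch, submit_time=1000):
        print(user, job_id, start)
```

The point is only that jobs from one user would be spaced out long enough for the load monitor to catch up, while jobs from other users are unaffected.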


Chris Green, MiniBooNE / LANL. Email greenc@xxxxxxxx
Tel: (630) 840-2167. Fax: (630) 840-3867