
Re: [Condor-users] Defusing a Condor bomb?



On 10/26/05, Chris Green <greenc@xxxxxxxx> wrote:
> Hi,
>
> So I have a cluster of about 80 nodes, with file access for users spread
> over a handful of servers. I monitor the load on each server and don't let
> a user's job start if his server's load is too high (everyone uses a
> central submission script so requirements get added to the .cmd file). The
> problem is, if a user submits a bunch of jobs to an almost empty queue,
> they could all start running (and bring the server to its knees) before
> the load monitor notices. Is there some way to throttle the frequency at
> which jobs from the same user start executing to prevent this happening?

Per user, no (I think). Per schedd, yes:

http://condor.optena.com/display/CONDOR/JOB_START_DELAY

Note, however, that by that point the jobs will already have been matched,
so you would then have to prevent them from starting, which would be
rather inefficient.
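
As a rough sketch of the schedd-side throttle, assuming the setting goes in
the local config file on the submit machine (the 5 second value is only an
illustration, not a recommendation):

```
# Local Condor config on the submit machine (the one running condor_schedd).
# JOB_START_DELAY is the number of seconds the schedd waits between
# successive job starts. The value 5 is an assumed example; tune it to
# how quickly your load monitor reacts.
JOB_START_DELAY = 5
```

This applies to every job started by that schedd, not to a single user, and
the schedd has to reread its configuration before the change takes effect.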

I do not believe there is any easy way to allow only x jobs to be
negotiated and accepted at a time (which is what you would need to
do).

Matt


> Thanks,
> Chris.
>
> --
> Chris Green, MiniBooNE / LANL. Email greenc@xxxxxxxx
> Tel: (630) 840-2167. Fax: (630) 840-3867