Re: [Condor-users] Defusing a Condor bomb?
- Date: Wed, 26 Oct 2005 14:59:38 -0400
- From: "Ian Chesal" <ICHESAL@xxxxxxxxxx>
- Subject: Re: [Condor-users] Defusing a Condor bomb?
> > On 10/26/05, Chris Green <greenc@xxxxxxxx> wrote:
> >> Is there some way to throttle the frequency at which jobs from the
> >> user start executing to prevent this happening?
> > same user no (I think). same schedd yes...
> > http://condor.optena.com/display/CONDOR/JOB_START_DELAY
> > note however that the jobs will have been matched so you would have
> > to then prevent them from starting - this would be rather inefficient.
> > I do not believe there is any easy way of only allowing x jobs to be
> > negotiated and accepted at a time (which is what you would need to
> > do).
> I'm not sure this would do the trick, since the load check has already
> been done, right (it's in Requirements)? Once the job has been matched,
> it's going to start on some machine whether it takes 1 second or 10, and
> still drive the load through the roof. I think what I need is some way of
> adding the schedd's time-since-last-match to the requirements clause. Is
> that something that can be done? If it's by schedd rather than by user,
> that's fine: 30 secs should be enough to prevent a load cascade.
This is a great waste of CPU time, but you could have each job sleep
for a random few seconds before it starts fetching data. That is
assuming your jobs are doing the data transfer and not the condor