
Re: [Condor-users] Defusing a Condor bomb?



> > On 10/26/05, Chris Green <greenc@xxxxxxxx> wrote:
> <snip>
> >
> >> Is there some way to throttle the frequency at which jobs from the same
> >> user start executing to prevent this happening?
> >
> > same user no (I think). same schedd yes...
> >
> > http://condor.optena.com/display/CONDOR/JOB_START_DELAY
> >
> > note however that the jobs will have been matched so you would have to
> > then prevent them from starting - this would be rather inefficient.
> >
> > I do not believe there is any easy way of only allowing x jobs to be
> > negotiated and accepted at a time (which is what you would need to
> > do).
> 
> I'm not sure this would do the trick, since the load check has already
> been done, right (it's in Requirements)? Once the job has been matched,
> it's going to start on some machine whether it takes 1 second or 10, and
> still drive the load through the roof. I think what I need is some way of
> adding the schedd's time-since-last-match to the requirements clause. Is
> that something that can be done? If it's by schedd rather than by user,
> that's fine: 30 secs should be enough to prevent a load cascade.

This is a great waste of CPU time, but you could have each job sleep
for a random few seconds before it starts fetching data. That assumes
your jobs are doing the data transfer themselves and not the Condor
daemons.
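
A rough sketch of the random-sleep idea, in Python, in case it helps.
Everything named here is made up for illustration: fetch_data() stands
in for whatever your job really does to pull its input, and the 30
second cap is only borrowed from the figure mentioned above. It is not
a Condor feature, just a wrapper around the job's own startup.

```python
import random
import time


def jittered_delay(max_seconds, rng=random, sleep=time.sleep):
    """Sleep for a uniformly random time in [0, max_seconds]; return the delay."""
    if max_seconds < 0:
        raise ValueError("max_seconds must be non-negative")
    delay = rng.uniform(0.0, max_seconds)
    sleep(delay)
    return delay


def fetch_data():
    # Hypothetical placeholder for the job's real data transfer step.
    return "data"


def run_job(max_delay=30.0, sleep=time.sleep):
    # Spread out the start of the transfer so that jobs matched at the
    # same moment do not all hit the data source at once.
    jittered_delay(max_delay, sleep=sleep)
    return fetch_data()
```

Each job would call run_job() (or the equivalent in its own wrapper
script) in place of starting the transfer straight away. The slot sits
idle during the sleep, which is the wasted CPU time mentioned above.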

- Ian