
Re: [Condor-users] reducing job start time



Hi,

I have seen another system that has trouble with micro jobs (bqtools, if
I remember correctly). They tackle this by grouping micro jobs
together and managing each group as one normal job. Maybe you could do the
same yourself by submitting bigger jobs that each bundle many micro jobs.
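
A rough sketch of the grouping idea in Python, in case it helps. The names
group_items, run_group and process_item are made up for illustration and
are not part of Condor or bqtools; I am assuming each group would be handed
to one submitted job that works through its items in order.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def group_items(items: Iterable[T], group_size: int) -> Iterator[List[T]]:
    """Yield consecutive groups of at most group_size items."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    group: List[T] = []
    for item in items:
        group.append(item)
        if len(group) == group_size:
            yield group
            group = []
    if group:
        # Last group may be smaller than group_size.
        yield group


def run_group(group: List[T], process_item: Callable[[T], R]) -> List[R]:
    """Body of one 'normal' job: process every micro job in the group."""
    return [process_item(item) for item in group]


if __name__ == "__main__":
    # Hypothetical queue items; a real manager would submit one job per group.
    queue_items = list(range(10))
    for group in group_items(queue_items, 4):
        print(run_group(group, lambda x: x * 2))
```

The group size is the knob: larger groups spread the submit-to-start delay
over more items, at the cost of each item waiting longer for its group.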

Frederic Bastien

On Fri, Apr 4, 2008 at 5:35 PM, Jaime Frey <jfrey@xxxxxxxxxxx> wrote:
> On Apr 4, 2008, at 4:25 AM, Jos Houtman wrote:
>
>  > I am wondering if there are ways to improve the job start time (the
>  > time between submit and actual startup).
>  > My plan is to use condor to run queue-processors, which are
>  > submitted by
>  > a manager that makes sure we keep up with the queue. The manager also
>  > runs in the cluster.
>  >
>  > Because we want to keep queue processing times low, a worker normally
>  > only works on a few queue items.
>  > At the moment this leads to an average runtime of 2 seconds for a
>  > worker.
>  > This makes anticipating and scheduling workers for the manager harder
>  > because the average time from submit to running a worker is about 17
>  > seconds.
>  >
>  > I was wondering if the job start time could be reduced even more?
>  > I already lowered the NEGOTIATOR_INTERVAL to 15 seconds and tried
>  > running condor_reschedule after a submit.
>  > The cluster will consist of about 20 quad-core nodes, but any
>  > solution should also scale to ten times that.
>
>
>  Condor isn't designed to run many 2-second jobs efficiently. But there
>  are a couple things you can try to reduce the queue time of your jobs:
>
>  * Change NEGOTIATOR_CYCLE_DELAY in the config file. This sets the
>  minimum time between negotiation cycles and defaults to 20 seconds.
>
>  * It can take a while for the negotiator to match a job with a
>  machine. But once the job completes, the schedd can immediately run
>  another job on the same machine if more jobs are available. So if you
>  can submit your jobs in large groups, they will execute faster.
>
>  * Take a look at Condor's Computing On Demand (COD). It's a way to
>  give short jobs quick access to your Condor machines. Section 4.3 of
>  the Condor 7.0 manual has more information:
>  http://www.cs.wisc.edu/condor/manual/v7.0/4_3Computing_On.html
>
>  Thanks and regards,
>  Jaime Frey
>  UW-Madison Condor Team