
Re: [Condor-users] reducing job start time



First, thanks for the responses!

Dan Bradley wrote:
>The schedd can quickly recycle claims when one job finishes and there is already another job in the queue with compatible requirements (and from the same user).  Is your problem that jobs are only submitted after other jobs complete?

I had not noticed this behavior yet. New jobs will be inserted at 5-second intervals;
this spreads the workers evenly and avoids bursty queue processing. So I guess this will cut the average start-up time to about 3 seconds.

Greg thian wrote:
>set up your schedd as a dedicated scheduler, as per the manual, and submit your jobs to the "parallel" 
>universe with a machine_count of 1.

I will give this a try and see if this cuts down the startup time even more.

Jaime Frey wrote:
>* Change NEGOTIATOR_CYCLE_DELAY in the config file. This sets the minimum time between negotiation cycles and defaults to 20 seconds.

Is there a safe lower limit to which I can set this? Are there any symptoms that would indicate I have set it too low?


I will submit jobs in groups as much as possible, but because the jobs are queue processors I can only do this to a limited extent before queue processing becomes very irregular and bursty. The same applies to Frederic's suggestion.

As I understand COD, it reserves one machine per COD job. That is overkill for my case and would lead to over-reservation of resources.

With kind regards,

Jos Houtman
System administrator Hyves.nl
email: jos@xxxxxxxx


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Frédéric Bastien
Sent: Monday, April 7, 2008 15:37
To: Condor-Users Mail List
Subject: Re: [Condor-users] reducing job start time

Hi,

I have seen another system that has trouble with micro jobs (bqtools, if
I remember correctly). They tackle this by grouping micro jobs
together and managing the group as one normal job. Maybe you could do it
yourself by making bigger jobs that each group many micro jobs.
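
A rough sketch of the grouping idea above, in Python. This is only an illustration under my own assumptions: the names group_micro_jobs and run_group are hypothetical, nothing here is a Condor or bqtools API, and the "handler" stands in for whatever a single micro job actually does. Each group would be submitted as one ordinary job, and the worker would loop over the items in its group.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def group_micro_jobs(items: Sequence[T], group_size: int) -> Iterator[List[T]]:
    """Yield consecutive groups of at most group_size items (hypothetical helper)."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    for start in range(0, len(items), group_size):
        yield list(items[start:start + group_size])


def run_group(group: Sequence[T], handler: Callable[[T], R]) -> List[R]:
    """Process every micro job of one group inside a single worker run."""
    return [handler(item) for item in group]


if __name__ == "__main__":
    # Ten placeholder queue items, grouped four at a time.
    queue_items = list(range(10))
    for group in group_micro_jobs(queue_items, 4):
        # In a real setup each group would become one submitted job.
        print(run_group(group, lambda item: item * 2))
```

The group size is the knob here: larger groups mean fewer submissions and less scheduling overhead per item, but each item may wait longer before its group is started.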

Frederic Bastien

On Fri, Apr 4, 2008 at 5:35 PM, Jaime Frey <jfrey@xxxxxxxxxxx> wrote:
> On Apr 4, 2008, at 4:25 AM, Jos Houtman wrote:
>
>  > I am wondering if there are ways to improve the job start time (the
>  > time between submit and actual startup).
>  > My plan is to use condor to run queue-processors, which are
>  > submitted by
>  > a manager that makes sure we keep up with the queue. The manager also
>  > runs in the cluster.
>  >
>  > Because we want to keep queue processing times low, a worker normally
>  > only works on a few queue items.
>  > At the moment this leads to an average runtime of 2 seconds for a
>  > worker.
>  > This makes anticipating and scheduling workers for the manager harder
>  > because the average time from submit to running a worker is about 17
>  > seconds.
>  >
>  > I was wondering if the job start time could be reduced even more?
>  > I already lowered the NEGOTIATOR_INTERVAL to 15 seconds and tried
>  > running condor_reschedule after a submit.
>  > The cluster will consist of about 20 quad-core nodes, but any
>  > solutions
>  > should also scale to ten times that size.
>
>
>  Condor isn't designed to run many 2-second jobs efficiently. But there
>  are a couple things you can try to reduce the queue time of your jobs:
>
>  * Change NEGOTIATOR_CYCLE_DELAY in the config file. This sets the
>  minimum time between negotiation cycles and defaults to 20 seconds.
>
>  * It can take a while for the negotiator to match a job with a
>  machine. But once the job completes, the schedd can immediately run
>  another job on the same machine if more jobs are available. So if you
>  can submit your jobs in large groups, they will execute faster.
>
>  * Take a look at Condor's Computing On Demand (COD). It's a way to
>  give short jobs quick access to your Condor machines. Section 4.3 of
>  the Condor 7.0 manual has more information:
>  http://www.cs.wisc.edu/condor/manual/v7.0/4_3Computing_On.html
>
>  Thanks and regards,
>  Jaime Frey
>  UW-Madison Condor Team
>
>
>
>
>
>  _______________________________________________
>  Condor-users mailing list
>  To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>  subject: Unsubscribe
>  You can also unsubscribe by visiting
>  https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>
>  The archives can be found at:
>  https://lists.cs.wisc.edu/archive/condor-users/
>