Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab

Re: [Condor-users] Condor 6.8.n: job running delays: RUN TIMES stay at Zero

Date: Fri, 11 Dec 2009 03:52:03 +0700
From: Hendra Rahmawan <hrahmawan@xxxxxxxxx>
Subject: Re: [Condor-users] Condor 6.8.n: job running delays: RUN TIMES stay at Zero


-----Original Message-----
From: Daniel Forrest <dan.forrest@xxxxxxxxxxxxx>
Sent: 10 December 2009 06:20
To: Kevin.Buckley@xxxxxxxxxxxxx
Cc: condor-users@xxxxxxxxxxx; Paul.Chote@xxxxxxxxx
Subject: Re: [Condor-users] Condor 6.8.n: job running delays: RUN TIMES stay at Zero

On Thu, Dec 10, 2009 at 10:38:35AM +1300, Kevin.Buckley@xxxxxxxxxxxxx wrote:
> 
> Just to cement your viewpoint in place though, the only other posting
> I have seen, Matt's, seemed to be in reply to yours which makes your
> "As one other poster said" a little tautological!

I am referring to your original post back in October.  Ian Chesal
commented on my post at that time:

https://lists.cs.wisc.edu/archive/condor-users/2009-October/msg00094.shtml

Date: Wed, 14 Oct 2009 07:58:27 -0700
From: Ian Chesal <ICHESAL@xxxxxxxxxx>
Subject: Re: [Condor-users] Condor 6.8.n: job scheduling process delays

> Back to your original question, this is entirely a scalability issue.
> Prior to the 6.9.3 release the schedd simply couldn't handle more than
> a few thousand jobs in the job queue without a severe degradation in
> performance.  I believe your previous message stated you had around
> 17,500 jobs in the queue - this simply won't work with Condor 6.8.

Optionally, if you have only a handful of schedd machines in the pool
you can upgrade them to 7.x.x. I'm running 6.8.6 execution nodes with
7.0.x central machines (negotiator/collector, schedd/quill) without
issue.

- Ian


> If you tell me categorically that I can upgrade the master to a later
> version and not affect the existing grid then that'd enable VUW to
> take a huge step in the right direction at a much quicker pace.

So there is at least one person doing exactly this.

I know that we flock jobs to execute nodes running 6.8.n while our
central manager is 7.1.1.  And now that I look more closely, our Linux
machines are 7.1.1, but our Windows machines (execute only) are 6.8.4.
So it works for us with both flocking and a mixed local pool.
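
As a rough illustration of the upgrade advice quoted above, here is a minimal sketch that flags which submit machines in a pool are still older than 6.9.3, the release cited as fixing the schedd scalability problem. The host names, roles, version values, and function names are hypothetical placeholders, not taken from any real pool or from Condor itself; the inventory would have to be gathered by whatever means a site already uses.

```python
# Hypothetical sketch: find schedd hosts older than the 6.9.3 release
# mentioned in the thread. Nothing here calls Condor; the inventory
# below is made-up example data.

SCHEDD_FIX = (6, 9, 3)  # release cited in the thread as improving schedd scalability


def parse_version(text):
    """Turn a dotted version string such as '6.8.6' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def schedds_needing_upgrade(pool):
    """Return the sorted host names whose role is 'schedd' and whose
    version is lower than SCHEDD_FIX. Execute-only nodes are ignored,
    since the thread reports 6.8.n execute nodes working with 7.x
    central machines."""
    return sorted(
        host
        for host, (role, version) in pool.items()
        if role == "schedd" and parse_version(version) < SCHEDD_FIX
    )


# Assumed example inventory: host -> (role, version)
pool = {
    "submit1": ("schedd", "6.8.6"),
    "submit2": ("schedd", "7.0.4"),
    "exec1": ("execute", "6.8.6"),
    "exec2": ("execute", "6.8.4"),
}

print(schedds_needing_upgrade(pool))
```

Comparing versions as integer tuples rather than strings avoids mis-ordering releases such as 6.10 and 6.9.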

-- 
Daniel K. Forrest		Space Science and
dan.forrest@xxxxxxxxxxxxx	Engineering Center
(608) 890 - 0558		University of Wisconsin, Madison

[The entire original message is not included]
