Re: [Condor-users] our ideal configuration
- Date: Mon, 02 Jul 2007 14:05:17 +0200
- From: Horvátth Szabolcs <szabolcs@xxxxxxxxxxxxx>
- Subject: Re: [Condor-users] our ideal configuration
What is your claim timeout?
Well, I have claim_worklife set to 15 minutes and no
Tiers and priority are used completely separately (as I wrote, I was
under the impression that machine rank overrides
job and user priority settings). I'd like to avoid messing with the job
priority because it would make
manual priority setting (for changing the job execution order for a
user) a lot more difficult.
Do your users make sure that the higher
tier jobs always have a higher priority? This is easy to do now that
the priority is an int, not plus/minus 20. Just add a few million for
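
A minimal sketch of that idea, assuming each tier gets its own block of one million priority values and that manual adjustments are clamped so they can only reorder jobs inside a tier. The names TIER_STEP, MAX_MANUAL and effective_priority are hypothetical, and the exact offset is an assumption, not something stated in this thread. The resulting integer is what would be set as the job priority.

```python
# Hypothetical scheme: reserve a block of priority values per tier so that a
# manual adjustment can never lift a job into the next tier.
TIER_STEP = 1_000_000            # assumed offset per tier
MAX_MANUAL = TIER_STEP // 2 - 1  # largest manual adjustment in either direction


def effective_priority(tier: int, manual: int = 0) -> int:
    """Combine a tier and a manual per-job adjustment into one integer priority."""
    if tier < 0:
        raise ValueError("tier must be non-negative")
    if abs(manual) > MAX_MANUAL:
        raise ValueError(f"manual adjustment must be within +/-{MAX_MANUAL}")
    return tier * TIER_STEP + manual


if __name__ == "__main__":
    # (name, tier, manual adjustment) -- made-up jobs for illustration only
    jobs = [("dag1.job", 1, 20), ("dag2.job", 2, -20), ("dag1.urgent", 1, 500)]
    ordered = sorted(jobs, key=lambda j: effective_priority(j[1], j[2]), reverse=True)
    for name, tier, manual in ordered:
        print(name, effective_priority(tier, manual))
```

With this layout a user can still reorder jobs by hand inside a tier, while any job of a higher tier always outranks every job of a lower one.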
That's OK for us; the pool has very few submitters. And abusers are
killed on the spot. ;)
A wonderful arrangement. I considered getting Mike a gun as a pressie.
We have cowboy rules here on the render-farm... ;))
Well, the situation you write about is way more complex than the problems
Let's say a user submits two DAGMan jobs. Both DAGs submit 100 jobs and
in the order of job submission. Now I'd like to raise the tier / rank /
importance of the DAG that was submitted second.
So I change the attributes for all jobs of that dag and wait. I expect
that from this moment (or after the next
negotiation cycle) only the jobs of the second dag should start (since
they are preferred by the machine rank)
but this is not what I get. A few tasks do start from the second dag but
quite a few jobs still start running from the first one.
I thought that machine rank overrides user priorities altogether. Is it
Kind of. The scheduling algorithm works by requesting jobs bit by bit
from the schedd/user virtual queue. It was possible under certain
circumstances (say, one user takes the whole pool, but IIRC this was not
a requirement) for the scheduler to say, 'That's it, no need to go
further down the queue, as you would only be comparing against your
jobs that are already running, which wouldn't make sense since they
were further down the queue.'
This means tiers would have real issues with multiple different tier
structures between different groups of machines.
This may no longer be the case (I haven't been through the pie sharing
logic in a long time so the above might be well out of date)
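
A toy model of the early-stop behaviour described above, written only to make the argument concrete. It is not Condor's negotiator code, and the names Job, negotiate and early_stop are hypothetical. It assumes one running job per machine and that a queued job can start only by preempting a running job of a strictly lower tier.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    name: str
    tier: int  # what the machine rank prefers (higher is better)


def negotiate(queue: List[Job], running: List[Job], early_stop: bool) -> List[Job]:
    """Walk one user's queue in order and return the jobs that start.

    With early_stop=True the walk ends at the first job that matches nothing,
    so higher-tier jobs further down the queue are never considered.
    """
    slots = list(running)  # one running job per machine
    started: List[Job] = []
    if not slots:
        return started
    for job in queue:
        # Candidate machine: the one whose current job has the lowest tier.
        victim = min(range(len(slots)), key=lambda i: slots[i].tier)
        if job.tier > slots[victim].tier:
            slots[victim] = job  # machine rank prefers the new job: preempt
            started.append(job)
        elif early_stop:
            break  # "no need to go further down the queue"
    return started


if __name__ == "__main__":
    running = [Job("dag1.0", 1), Job("dag1.1", 1)]
    queue = [Job("dag1.2", 1), Job("dag2.0", 2), Job("dag2.1", 2)]
    print([j.name for j in negotiate(queue, running, early_stop=True)])
    print([j.name for j in negotiate(queue, running, early_stop=False)])
```

In this model the first queued job is from the lower tier and matches nothing, so with the early stop the second DAG's jobs are never offered to the machines that would prefer them.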
All submission is done through scripts so submission is pretty much
Hopefully the two points above will help with that. Ours is now very
stable in that regard, but we ruthlessly enforce the rules (both groups
that use the farm automate the submission through helper applications;
I strongly recommend this approach if you will have one or several 'in
the know' users and others who don't want to learn the intricacies and
just want the process to work roughly as a black box).