
Re: [Condor-users] Dagman & Job Priorities



Solving this with job priorities is hard, mainly because Condor has
only 40 priority levels for jobs (is there any explanation for that?).
If any integer could be used as a priority, this would definitely help,
since every DAG and its jobs could then get their own priority level,
enforcing correct FIFO order (except at the boundaries). With only 40
priorities available, one would have to run a watchdog that monitors the
queue and periodically raises the priorities of all the jobs in a DAG
once all the jobs of another DAG have finished (e.g., say you have three
DAGs running: DAG A at prio 1, DAG B at prio 2, DAG C at prio 3; when
DAG A completes, you move DAG B to prio 1 and DAG C to prio 2). This is
not trivial at all.
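
To make the watchdog idea concrete, here is a rough Python sketch of its
core logic. It is only an illustration under several assumptions: that a
higher numeric job priority runs first, that the usable range is 20 down
to -20 (both are just defaults here), and that a DAG's submission order
can be taken from its DAGMan cluster id. The function names
assign_priorities and prio_commands are made up for this example.

```python
def assign_priorities(dag_ids, top=20, bottom=-20):
    """Map each DAGMan cluster id to a job priority, oldest DAG first.

    Assumptions: a higher numeric priority runs first, and a smaller
    DAGMan cluster id means the DAG was submitted earlier. The top and
    bottom defaults are illustrative, not taken from this thread.
    """
    priorities = {}
    for rank, dag_id in enumerate(sorted(dag_ids)):
        # DAGs that do not fit into the range all share the lowest level.
        priorities[dag_id] = max(top - rank, bottom)
    return priorities


def prio_commands(priorities, jobs_by_dag):
    """Build one condor_prio command line per queued job of each DAG.

    jobs_by_dag maps a DAGMan cluster id to the ids ("cluster.proc")
    of its jobs currently in the queue; how that mapping is collected
    is left to the caller.
    """
    commands = []
    for dag_id, prio in sorted(priorities.items()):
        for job_id in jobs_by_dag.get(dag_id, []):
            commands.append(["condor_prio", "-p", str(prio), job_id])
    return commands


# Three DAGs in the queue; the ids are invented for the example.
before = assign_priorities([310, 305, 342])
# Once DAG 305 has finished, the remaining DAGs each move up one level.
after = assign_priorities([310, 342])
```

The watchdog would have to rerun this every time a DAG finishes or
DAGMan releases more jobs into the queue, which is exactly the part that
makes the approach awkward. The job list per DAG could probably be
collected with condor_q and a constraint on DAGManJobId, but that part
is not shown here.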


On Tue, 8 Feb 2005, Peter F. Couvares wrote:

> Michael S. Root wrote:
> > It is frequently the case for us that a single user is running
> > multiple DAGman jobs. The behavior we get is that all jobs from a
> > user's dags get run concurrently (within the user's resource limits),
> > such that the dags all finish at roughly the same time. It is
> > sometimes the case that a dag with just one job left will sit in the
> > queue for hours waiting for the same user's other dags (with more
> > unfinished jobs) to 'catch up'.
>
> Yes -- all other things (like requirements, rank, priority, etc.) being
> equal, the condor_schedd will run jobs in FIFO order. So when multiple
> DAGs are running in parallel, and releasing their respective jobs into
> the queue as they become ready, the "first" DAG's most recent job won't
> necessarily be first in the queue.
>
> > What we would like is to have all the jobs from the first dag
> > submitted to finish first, then the second, etc...
>
> Makes sense. If I understand correctly, they don't "have to" finish
> first in order to be correct -- but it's easier for you to follow and
> keep track of everything if they do.
>
> > Since the dags are not necessarily related in terms of what they're
> > processing and usually aren't submitted at the same time, it doesn't
> > make sense to have one dag depend on another.
>
> Strictly speaking, the different DAGs still do not "depend on" one
> another in the scenario you describe -- their jobs just aren't sorted
> in the queue in the order you'd like. If they do depend on each other,
> these dependencies should be represented in another DAG (i.e., a
> higher-level DAG containing your existing DAGs).
>
> > I thought about setting the machine RANK expression to "( -1 *
> > DAGManJobId )", thus the lowest numbered DAGman jobs would be
> > preferred. I haven't tried it yet, though, because I'm not sure if
> > this expression would apply before or after the machine has been
> > matched to a user.
>
> Interesting -- I wouldn't have thought of this, and although it should
> work, it will interfere with Condor's attempts to manage user
> priorities. The machine RANK is evaluated by the negotiator before a
> match is made, in order to select the best match.
>
> > Would a job with low user-priority and a low-numbered DAGManJobID get
> > priority over another job with a higher user-priority, but a
> > higher-numbered DAGManJobID?
>
> Yes, exactly -- because Condor respects a resource owner's wishes above
> all, including pool-wide user priority. If a resource owner says their
> machine prefers job X, Condor will not override that just to satisfy
> its attempts at fair-share. This is why I wouldn't use this approach,
> because it's not exactly what you want.
>
> > Even better would be if there were a way to look at the job priority
> > of DAGman itself and have sub-jobs get chosen based on that. I have
> > noticed that changing a DAGman job's priority doesn't have any effect
> > on its children.
>
> You're right, this would be a nice feature to have as an option -- and
> would solve your problem. I'll see if I can implement it sometime soon
> (i.e., before Condor 6.8.0). Keep an eye on the release notes...
>
> > It wouldn't be hard to write a script to change the priority of all of a
> > DAG's children, but it would have to be run repeatedly each time
> > DAGman submits more jobs into the queue (we often run with a -maxjobs
> > limit).
>
> Right -- this is ugly and kludgey and wrong, but could serve as a
> temporary solution, if it's important enough to you.
>
> -Peter
>
> --
> Peter Couvares                    University of Wisconsin-Madison
> Condor Project Research             Department of Computer Sciences
> pfc@xxxxxxxxxxx                     1210 W. Dayton St. Rm #4241
> (608) 265-8936                      Madison, WI 53706-1685
>