
Re: [Condor-users] Dagman & Job Priorities

Just wanted to add one more tidbit of information that may also be relevant. As per a previous question I asked on this list (http://lists.cs.wisc.edu/archive/condor-users/pre-2004-June/msg00759.shtml), we're using the following configuration "hack" to do resource re-allocation between every job:

	START = ( State != "Claimed" || $(StateTimer) < 120 )

This has worked quite well for us, and the additional load on our pool master has been negligible. I don't know whether it affects any solution to my current problem...
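To spell out what that START expression evaluates to, here is a minimal Python sketch of the same boolean. It assumes $(StateTimer) expands to the number of seconds the machine has spent in its current state; the function name and the example values are made up for illustration.

```python
def start_allows_job(state: str, seconds_in_state: int, window: int = 120) -> bool:
    """Mirror of: START = ( State != "Claimed" || $(StateTimer) < 120 )"""
    # True if the machine is not claimed, or has been in its
    # current state for less than `window` seconds.
    return state != "Claimed" or seconds_in_state < window


# Illustrative inputs only (not taken from a real pool).
for state, secs in [("Unclaimed", 900), ("Claimed", 30), ("Claimed", 300)]:
    print(state, secs, start_allows_job(state, secs))
```

This only models the expression itself, not how the startd or negotiator reacts once START turns false.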


Michael S. Root wrote:

Hi everyone. I've looked through the Condor documentation and mailing list archive for this, but haven't found what I'm looking for yet.

The problem is this: we frequently have a single user running multiple DAGMan jobs. The behavior we get is that jobs from all of that user's DAGs run concurrently (within the user's resource limits), so the DAGs all finish at roughly the same time. Sometimes a DAG with just one job left will sit in the queue for hours waiting for the same user's other DAGs (with more unfinished jobs) to 'catch up'.

What we would like is for all the jobs from the first DAG submitted to finish first, then those from the second, and so on. Since the DAGs are not necessarily related in terms of what they're processing and usually aren't submitted at the same time, it doesn't make sense to have one DAG depend on another.

I thought about setting the machine RANK expression to "( -1 * DAGManJobId )", so that jobs from the lowest-numbered DAGMan job would be preferred. I haven't tried it yet, though, because I'm not sure whether this expression would apply before or after the machine has been matched to a user. Would a job with low user priority and a low-numbered DAGManJobId get priority over another job with a higher user priority but a higher-numbered DAGManJobId?
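To make the intended ordering concrete, here is a small Python sketch of how that RANK value would order idle jobs if it were the only thing considered. It deliberately does not model user priority or the order in which users are negotiated, which is exactly the open question above; the job ids and dictionary layout are made up.

```python
def machine_rank(job: dict) -> int:
    # Same arithmetic as the proposed RANK expression: ( -1 * DAGManJobId )
    return -1 * job["DAGManJobId"]


def preferred_order(jobs: list) -> list:
    # Highest rank first; sorted() is stable, so jobs from the
    # same DAG keep their original queue order.
    return sorted(jobs, key=machine_rank, reverse=True)


# Hypothetical idle jobs belonging to two DAGMan jobs (400 and 410).
idle_jobs = [
    {"id": "412.0", "DAGManJobId": 410},
    {"id": "405.0", "DAGManJobId": 400},
    {"id": "413.0", "DAGManJobId": 410},
]

for job in preferred_order(idle_jobs):
    print(job["id"], machine_rank(job))
```

Jobs from the older DAG (lower DAGManJobId) sort ahead of jobs from the newer one, which is the behavior we are after within a single user.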

Even better would be a way to look at the job priority of the DAGMan job itself and have its sub-jobs chosen based on that. I have noticed that changing a DAGMan job's priority doesn't have any effect on its children. It wouldn't be hard to write a script to change the priority of all of a DAG's children, but it would have to be run repeatedly, each time DAGMan submits more jobs into the queue (we often run with a -maxjobs limit).
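For reference, the kind of script I have in mind would look roughly like the Python sketch below. It assumes condor_q and condor_prio are on the PATH, that the children carry a DAGManJobId attribute equal to the DAGMan job's cluster id, and that condor_q accepts -constraint and -format as used here. The function names, the 60-second interval, and the polling loop are my own illustration and are untested against a live pool.

```python
import subprocess
import time


def query_cmd(dag_cluster_id: int) -> list:
    # condor_q invocation that prints one "cluster.proc" per line
    # for every queued job whose DAGManJobId matches.
    return [
        "condor_q",
        "-constraint", f"DAGManJobId == {dag_cluster_id}",
        "-format", "%d.", "ClusterId",
        "-format", "%d\\n", "ProcId",
    ]


def parse_job_ids(output: str) -> list:
    # Keep non-empty lines; each is expected to look like "123.0".
    return [line.strip() for line in output.splitlines() if line.strip()]


def prio_cmd(job_id: str, priority: int) -> list:
    # condor_prio invocation that sets an absolute job priority.
    return ["condor_prio", "-p", str(priority), job_id]


def set_children_priority(dag_cluster_id: int, priority: int) -> int:
    # One pass: look up the DAG's queued children and re-prioritize each.
    result = subprocess.run(query_cmd(dag_cluster_id),
                            capture_output=True, text=True, check=True)
    job_ids = parse_job_ids(result.stdout)
    for job_id in job_ids:
        subprocess.run(prio_cmd(job_id, priority), check=True)
    return len(job_ids)


def watch(dag_cluster_id: int, priority: int, interval: int = 60) -> None:
    # Repeat forever, since DAGMan keeps submitting new children
    # (e.g. under -maxjobs). Stop it by hand when the DAG is done.
    while True:
        set_children_priority(dag_cluster_id, priority)
        time.sleep(interval)
```

Calling something like watch(1234, 10), where 1234 stands in for the DAGMan job's cluster id, would keep bumping the children as they appear. The polling is the ugly part, which is why I'd prefer a solution inside Condor itself.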

Anyone have any clever suggestions on how to implement something like this?

