
Re: [Condor-users] Dagging deeper in priorities and ranks



Hi,

If you can afford some loss of efficiency and overall speed (I don't know how broad your DAG tree is or what resources your DAG nodes need), you may want to do what I do in this case: use condor_wait in conjunction with condor_submit[_dag]. So far it has worked out OK for me. Micro-managing scheduling (which is effectively what condor_dagman forces you to do) is hard and takes a lot of implementation time, and CPU cycles are comparatively cheap (well... sort of :-)).

If your application prevents you from using command line tools in a less-than-clumsy way, there's always the option of using Condor's SOAP interface to add another metascheduling layer that does the same as above (e.g. keeps a queue of DAGs waiting to start, and submits them in order of DAG completion).
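
A minimal sketch of that submit-then-wait loop, in Python for illustration. It assumes condor_submit_dag and condor_wait are on the PATH, and that DAGMan's own job log ends up in <dag file>.dagman.log, which is worth checking against your installation. The function name run_dags_in_order and its run parameter are made up for this example.

```python
import subprocess
from collections import deque


def run_dags_in_order(dag_files, run=subprocess.run):
    """Submit each DAG only after the previous one has finished.

    `run` is injectable so the loop can be exercised without a Condor pool.
    """
    pending = deque(dag_files)   # DAGs still waiting to start
    finished = []
    while pending:
        dag = pending.popleft()
        # Hand the DAG to DAGMan; this returns once the DAGMan job is queued.
        run(["condor_submit_dag", dag], check=True)
        # Assumption: DAGMan's job log is <dag file>.dagman.log.
        # condor_wait blocks until the jobs recorded in that log have completed.
        run(["condor_wait", dag + ".dagman.log"], check=True)
        finished.append(dag)
    return finished
```

Calling run_dags_in_order(["first.dag", "second.dag"]) would start second.dag only after first.dag's DAGMan job has left the queue. A stale .dagman.log left over from an earlier run of the same DAG could confuse the wait, so cleaning up old logs first is probably a good idea.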

In some cases where I have multiple DAGs running, I've tweaked the -maxidle and -maxjobs flags of dagman to get a system which tends to bias towards jobs belonging to older DAGs, and thus recoups efficiency lost by having a strict dag-after-dag ordering mechanism. I did this by bumping up the priority of DAG nodes farther from the root.
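
For illustration, here is one way the throttled submit command could be assembled in Python. The -maxjobs and -maxidle options are the ones mentioned above; the helper name submit_dag_command and the numbers in the usage note are placeholders, not values I am recommending.

```python
def submit_dag_command(dag_file, max_jobs=None, max_idle=None):
    """Build a condor_submit_dag argument list with optional throttles."""
    cmd = ["condor_submit_dag"]
    if max_jobs is not None:
        # Cap on how many node jobs DAGMan keeps submitted at once.
        cmd += ["-maxjobs", str(max_jobs)]
    if max_idle is not None:
        # DAGMan holds back new submissions while this many jobs sit idle.
        cmd += ["-maxidle", str(max_idle)]
    cmd.append(dag_file)
    return cmd
```

For example, submit_dag_command("newer.dag", max_jobs=20, max_idle=5) gives a command that keeps a newer DAG from flooding the queue while an older DAG still has work outstanding. The right limits depend on pool size and have to be found by experiment.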

Maybe a next-generation dagman will provide the ability to implement easier solutions farther down the road. :)

Armen

Horvátth Szabolcs wrote:
Hi,

I'm still trying to force DAG-ordered job execution (so that a newly submitted DAG's jobs only start after all available jobs of the first DAG are completed), without any success. I completely stripped all priority and rank expressions and tried to get the basics straight:
I thought that if:
- all jobs are submitted by the same user (so user priority is not an issue),
- none of the jobs have priority defined (so they are equally 0 by default),
- and I set NEGOTIATOR_PRE_JOB_RANK or NEGOTIATOR_POST_JOB_RANK to either
(-1 * DAGManJobId) or (5000000 -  DAGManJobId)

I would get DAG-ordered execution. And of course I do not. ;)

(I used the negotiator rank expression instead of priorities because it does not require modifying all the submit scripts, and theoretically it should handle negative numbers. And if I can't use negative numbers, I have to somehow "limit" the number of jobs so that the subtraction is sure to give a positive number. That's the 5000000 - DAGManJobId version.)

So now I have multiple questions:
- What am I doing wrong? The documentation explicitly states that DAGManJobId can be used in expressions.
- How can I check the "results" of the negotiation process? Is there any way to see how all these expressions (user prio, job prio, job rank, negotiator rank and the rest) came together?

Thanks in advance.

Cheers,
Szabolcs



--
Armen Babikyan
MIT Lincoln Laboratory
armenb@xxxxxxxxxx . 781-981-1796