
Re: [Condor-users] Dagging deeper in priorities and ranks



Hi Armen,

First of all, thanks for the tips!
> If you can afford some loss of efficiency and overall speed (I don't know anything about the overall breadth of your DAG tree or resources needed by your DAG nodes), you may want to do what I do in this case: use condor_wait in conjunction with condor_submit[_dag].
The problem for us with this setup is that some of our jobs run much longer than others: a few jobs of a DAG may keep running for a very long time after the rest have already completed, while the available machines could already be processing the jobs of the second DAG. If I condor_wait
for the completion of the first DAG, I might lose a lot of precious time.
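Just so I'm sure we mean the same thing, I assume the condor_wait approach would look roughly like this (the DAG file names here are only placeholders):

    condor_submit_dag first.dag
    # condor_wait blocks until every job in the given user log has finished,
    # i.e. until the DAGMan job of first.dag leaves the queue
    condor_wait first.dag.dagman.log
    condor_submit_dag second.dag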
> If your application prevents you from using command line tools in a less-than-clumsy way, there's always the option to do the same thing by using condor's SOAP interface to effectively add another metascheduling layer that does the same as above (e.g. keeps a queue of DAGs waiting to start, and submits them in order of DAG completion).
We could definitely use command line tools to manage job submission, but I have to keep the dagman
jobs in the queue to be able to monitor the current state of the queue.
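That said, if we did go the command-line route, I imagine that extra metascheduling layer could be approximated by a simple loop that keeps its own queue of DAG files and submits the next one as soon as the previous one completes (again only a sketch, with made-up file names):

    for dag in first.dag second.dag third.dag; do
        condor_submit_dag "$dag"
        # wait for this DAG's DAGMan job to finish before submitting the next one
        condor_wait "$dag.dagman.log"
    done

The drawback for us is exactly the one above: only the currently running DAG's dagman job would be visible in the queue.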
> In some cases where I have multiple DAGs running, I've tweaked the -maxidle and -maxjobs flags of dagman to get a system which tends to bias towards jobs belonging to older DAGs, and thus recoups efficiency lost by having a strict dag-after-dag ordering mechanism. I did this by bumping up the priority of DAG nodes farther from the root.
This is a possible solution, although it's way more pain than I initially thought...
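For my own notes, I assume the setup would be along these lines: give deeper nodes a higher node priority in the .dag file and throttle each DAG at submission time. The node names, priority value and limits below are made up, and the PRIORITY keyword needs a DAGMan recent enough to support it:

    # first.dag -- PRIORITY biases which nodes DAGMan submits first
    JOB  A  a.submit
    JOB  B  b.submit
    PARENT A CHILD B
    PRIORITY B 10

    # -maxidle / -maxjobs cap how many node jobs this DAG keeps idle / in the queue,
    # so a single DAG cannot flood the pool on its own
    condor_submit_dag -maxidle 50 -maxjobs 100 first.dag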
> Maybe a next-generation dagman will provide the ability to implement easier solutions farther down the road.  :)
I definitely vote for this!


Cheers,
Szabolcs