
[HTCondor-users] possible DAG node priority changes



We're thinking about making some changes to the node priority scheme in DAGMan, and we wanted to get some feedback from users before doing this.

Please let us know if any of the proposed changes to the priority scheme would cause you big problems. (Or, for that matter, if you have ideas for a better scheme!)


Right now a node's priority is determined as follows:

  effective node prio = max(explicit node prio, parents' effective prios,
    DAG prio)

This is really a problem for eventually fixing/re-implementing the "respond to priority change" feature, because it makes things asymmetrical with regard to increasing/decreasing the DAG priority.

The current scheme also causes a problem because a high DAG priority erases relative priority differences of nodes within the DAG, which Igor and I both dislike.
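
To make the "erasing" effect concrete, here is a minimal Python sketch of the current formula. It is only an illustration: the function name and the priority values are made up for this example and are not DAGMan code.

```python
def current_effective_prio(explicit, parent_effective, dag_prio):
    """Current scheme: max of the node's explicit priority, its parents'
    effective priorities, and the DAG priority (hypothetical helper)."""
    return max([explicit, dag_prio, *parent_effective])

# Two parentless nodes with different explicit priorities (made-up values).
low, high = 1, 5
for dag_prio in (0, 10):
    a = current_effective_prio(low, [], dag_prio)
    b = current_effective_prio(high, [], dag_prio)
    print(f"DAG prio {dag_prio}: low={a}, high={b}")
# DAG prio 0: low=1, high=5
# DAG prio 10: low=10, high=10
```

With a DAG priority of 10, both nodes end up at 10, so the difference between their explicit priorities no longer has any effect.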

So there are a bunch of possible alternatives we've thrown back and forth:

  1) effective node prio = max(explicit node prio, parents' effective
    prios) + DAG prio
  2) effective node prio = sum(explicit node prio, parents' effective
    prios, DAG prio)
  3) effective node prio = explicit node prio + DAG prio

Number 1 is the smallest change from the way things currently work. But Igor doesn't like the max() function, and I'm not wild about it, either.

Number 2 is probably the most self-consistent -- parent DAGs and parent nodes have the same effect on priorities. But I'm worried that in "long" DAGs priorities will get ridiculously high.
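
A rough Python sketch of how priorities could grow under number 2 follows. It assumes one reading of the formula: each node's effective priority is its explicit priority, plus the sum of all of its parents' effective priorities, plus the DAG priority. The function names, DAG shapes, and values are hypothetical.

```python
def option2_chain(depth, explicit=1, dag_prio=0):
    """Effective priorities along a linear chain (each node has one parent)."""
    prios = []
    for _ in range(depth):
        parent_sum = prios[-1] if prios else 0
        prios.append(explicit + parent_sum + dag_prio)
    return prios

def option2_layered(depth, width=2, explicit=1, dag_prio=0):
    """Per-layer priority when every node has all `width` nodes of the
    previous layer as parents (assumed shape, for illustration only)."""
    prios = []
    for _ in range(depth):
        parent_sum = width * prios[-1] if prios else 0
        prios.append(explicit + parent_sum + dag_prio)
    return prios

print(option2_chain(5))               # [1, 2, 3, 4, 5]
print(option2_chain(5, dag_prio=10))  # [11, 22, 33, 44, 55]
print(option2_layered(5))             # [1, 3, 7, 15, 31]
```

Under this reading, priorities grow linearly with depth in a simple chain, and roughly double at each layer once nodes have two parents.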

Number 3 is very simple to understand, and avoids the "priority explosion" problem of number 2. The only downside is that things are not consistent between parent nodes and parent DAGs.
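
For comparison, here is a small Python sketch of number 3, again with made-up node names and values. The helper is hypothetical and not part of DAGMan.

```python
def option3_effective_prio(explicit, dag_prio):
    """Option 3: explicit node priority plus DAG priority, nothing else."""
    return explicit + dag_prio

# Hypothetical two-node DAG in which B is a child of A.
explicit = {"A": 5, "B": 1}
for dag_prio in (-10, 0, 10):
    eff = {n: option3_effective_prio(p, dag_prio) for n, p in explicit.items()}
    print(dag_prio, eff, "difference:", eff["A"] - eff["B"])
# -10 {'A': -5, 'B': -9} difference: 4
# 0 {'A': 5, 'B': 1} difference: 4
# 10 {'A': 15, 'B': 11} difference: 4
```

Raising or lowering the DAG priority shifts every node by the same amount, so the relative differences survive in both directions. Note that B does not inherit anything from its parent A, which is the inconsistency mentioned above.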

I'm currently leaning towards number 3, and Igor says he's okay with that.

But I wanted to get some feedback (and probably run things past htcondor-users) before making changes...

(BTW, I'm open to schemes other than the above 3, but I think we should go with something fairly simple. We actually had a more complex scheme than the current one for a while, and as I recall we went away from it because it was too hard to understand.)

If you want more info, you can get the full story here:
https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=4024,4
https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=3389

Kent
--
R. Kent Wenger (wenger@xxxxxxxxxxx, 608-262-6627,
http://www.cs.wisc.edu/~wenger/)
Computer Sciences Department
University of Wisconsin-Madison