
Re: [HTCondor-users] POST script user privileges in DAG



On Fri, 6 Feb 2015, Ying W wrote:

      I guess one workaround could be this:  instead of doing your
      file deleting in a POST script, add another node that does
      this.  If the node job is submitted similarly to the process
      jobs, it should end up with an NFS id of nobody:nogroup, and
      thereby be able to delete the files.  In this model, you'd have
      a delete node that would immediately follow each process node. 
      If you do that, the delete node will only be run if the process
      node succeeds.


The submit node has a user:group of ubuntu:ubuntu (the login that came with
the image I'm using), though other users could be possible in the future.
This user:group does not have permission to delete the nobody:nogroup files.
So if I use a POST script to submit the cleanup, I would be specifying
condor_submit as the executable then? My main concern with this approach is
that the newly submitted node job would sit in the queue for a while before
being run, though I guess I could raise its priority; that was the main
reason why I wanted to run it as a POST initially.

No, I'm saying you don't run the cleanup as a post script, you run it as a node, like this:

  Job download1 ...
  Job process1 ...
  Parent download1 Child process1
  Job cleanup1 ...
  Parent process1 Child cleanup1
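
A cleanup node's submit description could be pretty minimal -- for example,
something along these lines (an illustrative sketch only; the paths here are
placeholders for wherever the downloaded files live on your NFS mount):

  # cleanup.sub (illustrative sketch -- paths are placeholders)
  universe              = vanilla
  executable            = /bin/rm
  arguments             = -rf /mnt/nfs/scratch/$(id)
  should_transfer_files = NO
  log                   = cleanup_$(id).log
  queue

with the $(id) macro passed in via a VARS line on the cleanup node, the same
way your other nodes get their id.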

I've looked into categories before, but I couldn't think of a way to make
them work. I might be missing something, but I feel like CATEGORY and
concurrency LIMITS serve a similar function, even though one is at the job
level and the other is at the DAG level.

Maybe the challenges I'm facing would be clearer with this example:

Say I have 100 datasets to process, but my NFS cannot hold more than 10 at
once before filling up. I have 4 jobs I want to run on each dataset:
download (D#) -> preprocess (P#) -> calculate (C#) -> summarize (S#).
My DAG would then look something like:

Job D0 download.sub   # single threaded
Job P0 preprocess.sub # requires a lot of memory
Job C0 calculate.sub  # uses lots of cores
Job S0 summarize.sub  # takes a while mostly I/O bound

SCRIPT POST C0 rm_download.sh "<uuid0>" $RETURN
VARS D0 id="<uuid0>"
VARS P0 id="<uuid0>"
VARS C0 id="<uuid0>"
VARS S0 id="<uuid0>"

PARENT D0 CHILD P0
PARENT P0 CHILD C0
PARENT C0 CHILD S0

and then I repeat this 99 more times, up through D99, with different UUIDs.
I have concurrency LIMITS on download.sub to prevent overloading the download
server (roughly the setup sketched below), but what happens is that as soon
as each download job finishes, it releases its allocation on the LIMIT and
the next download starts, and I don't see how using categories would change
that. Ideally, I would want something to count towards MAXJOBS when the
download starts and not release its allocation until the POST script has
run. Putting preprocess/calculate in the same category would not fix things,
since their resource requirements are higher, so it would just be adding
another barrier to their being run?
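
(For reference, the LIMITS setup on the download jobs is roughly along these
lines -- the limit name and the value here are just illustrative:

# in download.sub: each running download consumes one token
concurrency_limits = DOWNLOAD_SERVER

# in the pool configuration: at most 5 downloads run at once
DOWNLOAD_SERVER_LIMIT = 5

The token is given back as soon as the download job exits, which is why this
doesn't bound how many datasets are sitting on NFS.)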

If I'm understanding correctly what you want to do, I think a combination of category throttles and priorities would do what you want. You could do something like this:

  Job D0 download.sub   # single threaded
  Job P0 preprocess.sub # requires a lot of memory
  Job C0 calculate.sub  # uses lots of cores
  Job R0 remove.sub     # cleans up input files
  Job S0 summarize.sub  # takes a while mostly I/O bound

  VARS D0 id="<uuid0>"
  VARS P0 id="<uuid0>"
  VARS C0 id="<uuid0>"
  VARS R0 id="<uuid0>"
  VARS S0 id="<uuid0>"

  PARENT D0 CHILD P0
  PARENT P0 CHILD C0
  PARENT C0 CHILD R0 S0 # remove and summarize can run in parallel?

  MAXJOBS nfs_limit 10
  CATEGORY D0 nfs_limit
  CATEGORY P0 nfs_limit
  CATEGORY C0 nfs_limit
  CATEGORY R0 nfs_limit
  # S0 not here because it doesn't depend on downloaded files

  PRIORITY D0 10
  PRIORITY P0 100
  PRIORITY C0 1000
  PRIORITY R0 10000
  # Not sure about priority for summarize

If you do something like this, your DAG should start out by submitting 10 download jobs. When the first download job finishes, the corresponding preprocess job will be submitted before any more download jobs, because of the higher priority. Then, as you work your way along, calculate jobs will be favored over preprocess jobs, and remove jobs will be the most favored.

(BTW, you can accomplish pretty much the same thing by setting DAGMAN_SUBMIT_DEPTH_FIRST to true instead of using the priorities. Depth-first will also favor summarize jobs over calculate jobs, which I'm not sure you want.)
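
If you want to try that, it's just a configuration knob: set it in the
HTCondor configuration that DAGMan reads, or in a per-DAG configuration file
named by a CONFIG line in the DAG file, e.g.

  DAGMAN_SUBMIT_DEPTH_FIRST = True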

The challenge I've had with this setup is that condor_submit_dag seems to
submit all the download jobs (D0-D99) at once (so using max-idle will not
work), and then, as each download finishes, it submits the preprocess jobs
(P0-P99). However, once the preprocess jobs finish, it rarely starts a
calculate job because not enough resources are available, so it mostly just
bounces between download and preprocess. The current workaround I'm thinking
about involves splitting my original DAG into 10 smaller DAG files.

I think the above category/priority setup should solve this.

Maxidle isn't really what you want for this kind of thing -- it's more to avoid overloading your pool as a whole.
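
(For reference, that throttle is set either per DAG on the command line or
pool-wide in configuration -- the number here is just an example:

  condor_submit_dag -maxidle 50 my.dag
  # or, in the HTCondor configuration:
  DAGMAN_MAX_JOBS_IDLE = 50
)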

Kent