[Condor-users] condor_hold and DAGs?
- Date: Thu, 03 May 2007 16:35:18 -0400
- From: Armen Babikyan <armenb@xxxxxxxxxx>
- Subject: [Condor-users] condor_hold and DAGs?
Does condor have a command that I can run that will send a "hold"
message to a DAG and all sub-DAGs? And similarly release them? Since
condor_hold operates on cluster_id and process_id, and DAGs seem to run
"in-band" with respect to Condor Daemons (as "scheduler" processes), I
wouldn't figure condor/condor_dagman instances have a mechanism that
(e.g.) sends hold messages to all their children before sending one to
themselves, but please let me know if I've missed something.
I know this functionality could probably be emulated through a script
that reads job ClassAds, figures out the DAG job tree underneath a
particular instance, and holds each job in it, but that may be somewhat
messy and error-prone.
I'm curious to know if this strategy has worked well for other people.
Is there a better solution that's internal to condor, or coming out in a
future release?
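
For concreteness, here is a rough sketch of the tree-walking part of such a script. It assumes each node job's ClassAd carries a DAGManJobId attribute naming the cluster of the condor_dagman instance that submitted it; I have not verified that on 6.7.20. The function name and the (cluster_id, dagman_job_id) input shape are made up for illustration, and the pairs would have to be extracted from condor_q -l output by some other means.

```python
from collections import defaultdict


def dag_descendants(jobs, root_cluster):
    """Return the cluster ids of every job underneath root_cluster.

    jobs is an iterable of (cluster_id, dagman_job_id) pairs, where
    dagman_job_id is None for jobs that no DAGMan submitted.
    Children are listed before their parent DAGMan, so holding jobs
    in the returned order reaches the leaves first. The root itself
    is not included; the caller would hold it last.
    """
    # Map each DAGMan cluster id to the clusters it submitted.
    children = defaultdict(list)
    for cluster_id, parent in jobs:
        if parent is not None:
            children[parent].append(cluster_id)

    ordered = []
    seen = set()

    def visit(cluster):
        for child in sorted(children.get(cluster, [])):
            if child not in seen:
                seen.add(child)
                visit(child)           # descend into sub-DAGs first
                ordered.append(child)  # then record the child itself

    visit(root_cluster)
    return ordered
```

Each id in the returned list could then be passed to condor_hold, followed by the top-level DAGMan's own cluster id. Jobs that a DAGMan submits between the query and the hold would be missed, which is part of why this feels error-prone.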
Since I'm writing a feature request, I may as well go all the way: it
would be nice to have a small dichotomy in commands related to holding DAGs:
- one command to hold all jobs in a DAG tree the traditional way (i.e.
running jobs are vacated before being put back on the scheduler's queue)
- another command to hold all jobs in a DAG tree so that running jobs
continue running, but all Idle jobs are held, and existing dagman
instances don't submit new jobs until being released.
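
To make the difference between the two modes concrete, here is a small sketch of how a script might pick which jobs to hold in each case. It relies on the JobStatus ClassAd attribute (1 = Idle, 2 = Running, 5 = Held); the function name, the keep_running flag, and the (cluster_id, job_status) input shape are hypothetical. It only covers job selection. Stopping a running dagman from submitting new jobs is the part that would need support inside condor_dagman itself, and this sketch does not attempt it.

```python
# JobStatus codes as they appear in job ClassAds.
IDLE, RUNNING, HELD = 1, 2, 5


def select_jobs_to_hold(jobs, keep_running=False):
    """Pick the cluster ids that should be put on hold.

    jobs is an iterable of (cluster_id, job_status) pairs.
    keep_running=False: traditional hold, every job not already held.
    keep_running=True: only Idle jobs, so running jobs continue.
    """
    selected = []
    for cluster_id, status in jobs:
        if status == HELD:
            continue  # already on hold, nothing to do
        if keep_running and status != IDLE:
            continue  # leave running jobs alone in this mode
        selected.append(cluster_id)
    return selected
```

With jobs 11 (Idle), 12 (Running), 13 (Held) and 14 (Idle), the traditional mode would select 11, 12 and 14, while the second mode would select only 11 and 14.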
I'm using version 6.7.20 of Condor, and haven't had a need to upgrade
("if it ain't broke don't fix it"). I've looked through the changelogs
for new versions, but haven't seen this feature. Please let me know if
I've missed it!
MIT Lincoln Laboratory
armenb@xxxxxxxxxx . 781-981-1796