Re: [Condor-users] condor_hold and DAGs?
- Date: Thu, 3 May 2007 16:10:03 -0500 (CDT)
- From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
- Subject: Re: [Condor-users] condor_hold and DAGs?
On Thu, 3 May 2007, Armen Babikyan wrote:
> Does condor have a command that I can run that will send a "hold"
> message to a DAG and all sub-DAGs? And similarly release them? Since
> condor_hold operates on cluster_id and process_id, and DAGs seem to run
> "in-band" with respect to Condor Daemons (as "scheduler" processes), I
> wouldn't figure condor/condor_dagman instances have a mechanism that
> (e.g.) sends hold messages to all their children before sending one to
> themselves, but please let me know if I've missed something.
We're working on this, but I don't have a firm ETA for this feature.
It sounds like you're running nested DAGs, right?
For people out there who aren't running nested DAGs, you can do this
pretty easily with

    condor_hold -const 'DAGManJobId == xyz'

where xyz is the cluster ID of the condor_dagman job. This doesn't work
its way down through nested DAGs, though.
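
Until something built-in exists, walking down a nested DAG tree by hand could look roughly like the sketch below. It is only an illustration under some assumptions: each job is given as a (ClusterId, DAGManJobId) pair already read from the queue, DAGManJobId is None for jobs that don't belong to a DAG, and the function names and sample cluster numbers are made up for this example. It is not something DAGMan provides.

```python
from collections import defaultdict, deque


def dag_tree_clusters(jobs, root_cluster):
    """Return the cluster IDs of all jobs below root_cluster.

    jobs is an iterable of (cluster_id, dagman_job_id) pairs, where
    dagman_job_id is None for jobs that were not submitted by DAGMan.
    The result is in breadth-first order, nearest children first.
    """
    # Map each DAGMan cluster ID to the clusters it submitted.
    children = defaultdict(list)
    for cluster_id, dagman_job_id in jobs:
        if dagman_job_id is not None:
            children[dagman_job_id].append(cluster_id)

    found = []
    seen = {root_cluster}
    queue = deque([root_cluster])
    while queue:
        parent = queue.popleft()
        for child in children.get(parent, []):
            if child not in seen:  # guard against malformed input
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


def hold_command(jobs, root_cluster):
    """Build a condor_hold argument list, deepest jobs first."""
    clusters = dag_tree_clusters(jobs, root_cluster)
    # Reversing breadth-first order puts deeper levels ahead of shallower ones.
    return ["condor_hold"] + [str(c) for c in reversed(clusters)]


# Hypothetical queue: cluster 10 is the top-level DAGMan job, 12 is a
# nested DAGMan job it submitted, and 20/21 belong to an unrelated DAG.
SAMPLE_JOBS = [(10, None), (11, 10), (12, 10), (13, 12), (14, 12),
               (20, None), (21, 20)]

if __name__ == "__main__":
    print(" ".join(hold_command(SAMPLE_JOBS, 10)))
```

The pairs would have to come from the queue first, for example by parsing the ClusterId and DAGManJobId attributes out of condor_q -long output. The sketch only holds the jobs below the top-level DAGMan job, and how a held nested condor_dagman behaves is not something it addresses.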
> I know this functionality could probably be emulated through a script
> that reads job classads, figures out the DAG job tree underneath a
> particular instance, but that may be somewhat messy and error-prone.
> I'm curious to know if this strategy has worked well for other people.
> Is there a better solution that's internal to condor, or coming out in a
> future release?
> Since I'm writing a feature request, I may as well go all the way: it
> would be nice to have a small dichotomy in commands related to holding DAGs:
> - one command to hold all jobs in a DAG tree the traditional way (i.e.
> running jobs are vacated before being put back on the scheduler's queue)
> - another command to hold all jobs in a DAG tree so that running jobs
> continue running, but all Idle jobs are held, and existing dagman
> instances don't submit new jobs until being released.
We'll take your request into account!
> I'm using version 6.7.20 of Condor, and haven't had a need to upgrade
> ("if it ain't broke don't fix it"). I've looked through the changelogs
> for new versions, but haven't seen this feature. Please let me know if
> I've missed it!
Nope, unfortunately not!