Re: [HTCondor-users] How to hold/Release all dag jobs when hold/release dagman job?



On Wed, 21 Aug 2013, 钱晓明 wrote:

> I know all DAG jobs are removed when I condor_rm the DAGMan job, but
> hold/release does not work that way.
> How can I make all jobs be held/released along with the DAGMan job? I
> think I should add something to my job submit file.

It's not too hard (assuming you don't have nested DAGs). You do two condor_hold commands -- one to hold the DAGMan job itself, and one to hold the node jobs.

Here's an example:

manta(222)% condor_q

-- Submitter: wenger@xxxxxxxxxxxxxxxxx : <128.105.14.228:51653> : manta.cs.wisc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
 318.0   wenger          8/21 11:42   0+00:00:29 R  0   1.7  condor_dagman
 320.0   wenger          8/21 11:43   0+00:00:03 R  10  0.0  job_dagman_node_pr

2 jobs; 0 completed, 0 removed, 0 idle, 2 running, 0 held, 0 suspended
manta(223)% condor_hold 318
All jobs in cluster 318 have been held
manta(224)% condor_hold -constraint "DAGManJobId==318"
All jobs matching constraint (DAGManJobId==318) have been held
manta(225)% condor_q

-- Submitter: wenger@xxxxxxxxxxxxxxxxx : <128.105.14.228:51653> : manta.cs.wisc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
 318.0   wenger          8/21 11:42   0+00:00:42 H  0   1.7  condor_dagman
 320.0   wenger          8/21 11:43   0+00:00:30 H  10  0.0  job_dagman_node_pr

2 jobs; 0 completed, 0 removed, 0 idle, 0 running, 2 held, 0 suspended
manta(226)%
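
Releasing works the same way with condor_release, which takes the same -constraint option. Here is a minimal sketch that wraps both directions in one shell function. The name dag_hold_release is made up for illustration and is not an HTCondor command. Using the same command order for release as for hold is an assumption.

```shell
# Sketch: hold or release a DAGMan job and its direct node jobs.
# dag_hold_release is a hypothetical helper name, not an HTCondor command.
#   $1 = "hold" or "release"
#   $2 = cluster ID of the condor_dagman job (318 in the example above)
dag_hold_release() {
    case "$1" in
        hold|release) ;;
        *) echo "usage: dag_hold_release hold|release <dagman_cluster_id>" >&2
           return 1 ;;
    esac
    # The DAGMan job itself first, as in the transcript above
    # (keeping this order for release too is an assumption).
    "condor_$1" "$2" || return 1
    # Then every node job that records this DAGMan job as its parent.
    "condor_$1" -constraint "DAGManJobId==$2"
}
```

With this defined, "dag_hold_release hold 318" runs the two commands from the transcript, and "dag_hold_release release 318" runs the condor_release equivalents. Like the manual commands, it does not descend into sub-DAGs.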


If you have sub-DAGs, you'll have to do the condor_hold with the constraint for each sub-DAG.
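
Until such a command exists, the per-sub-DAG step could be scripted roughly as below. This is only a sketch and rests on two assumptions: that a sub-DAG's condor_dagman job runs in the scheduler universe (JobUniverse == 7), and that it carries its parent's DAGManJobId like any other node job. The function names are made up for illustration.

```shell
# Sketch: hold a DAG, its node jobs, and the node jobs of nested sub-DAGs.
# hold_dag_nodes and hold_dag_tree are hypothetical helper names.
# Assumption: a sub-DAG's condor_dagman job has JobUniverse == 7 and
# has DAGManJobId set to the cluster ID of its parent DAGMan job.
hold_dag_nodes() {
    # Hold the direct node jobs of DAGMan job $1. This also holds the
    # condor_dagman jobs of any sub-DAGs, since they are node jobs too.
    condor_hold -constraint "DAGManJobId==$1"
    # Find the sub-DAG DAGMan jobs among those nodes and repeat for each.
    for sub_id in $(condor_q -constraint "DAGManJobId==$1 && JobUniverse==7" \
                        -format "%d\n" ClusterId); do
        hold_dag_nodes "$sub_id"
    done
}

hold_dag_tree() {
    # Hold the top-level DAGMan job first, then walk down the tree.
    condor_hold "$1" && hold_dag_nodes "$1"
}
```

For the example above this would be called as "hold_dag_tree 318". Swapping condor_hold for condor_release should give the release counterpart, since held jobs still show up in condor_q.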

I'm thinking that we should create a command that does this automatically, including handling sub-DAGs...

Kent Wenger
CHTC Team