[HTCondor-users] DAGman jobs failing custom requirements



We’re just getting started with DAGMan jobs and have run into a small problem.

Our normal condor_submit jobs work fine, and each job in the DAG runs fine when I submit it individually, but when I submit the whole DAG the job doesn’t run. The schedd log says it has failed its requirements, e.g. “The Requirements attribute for job 7918.0 did not evaluate. Unable to start job”, and the job just sits idle in the queue.

If I condor_qedit the requirements and remove the first clause (which is TARGET.Site == MY.Site), then the DAG runs to completion.

We have this extra ‘Site’ attribute because we’re geographically distributed and it’s best for users to run their jobs locally for better file I/O. It is set in each server’s condor_config.
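
For reference, a setup like this is commonly wired up along the following lines. This is only a sketch: the site value "example_site" is hypothetical, and the use of APPEND_REQUIREMENTS to add the clause is an assumption about how the requirement ends up in the job ad, since the actual condor_config is not shown here.

```
# Hypothetical excerpt from a per-server condor_config (all values are assumptions)
Site = "example_site"

# Advertise Site in the machine (startd) ads so TARGET.Site can resolve during matchmaking
STARTD_ATTRS = $(STARTD_ATTRS) Site

# Copy Site into every job ad at submit time so MY.Site can resolve
SUBMIT_ATTRS = $(SUBMIT_ATTRS) Site

# Append the site-affinity clause to the Requirements of jobs in every universe
APPEND_REQUIREMENTS = (TARGET.Site == MY.Site)
```

If the clause is appended for every universe in this way, it would presumably also be added to the DAGMan job itself, which condor_submit_dag submits as a scheduler universe job rather than a vanilla one. In that case APPEND_REQ_VANILLA, which applies only to vanilla universe jobs, might be worth trying in place of APPEND_REQUIREMENTS.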

 

Any idea why condor_submit works OK with this requirement but condor_submit_dag doesn’t?

My submit files are very basic, as this is just a demo to help our users run DAG jobs. I put together a simple DAG that cats a bunch of files, greps each, sorts, then heads the result.

Here’s my .dag and submit files:

 

---------------------------------------

illustrious$ cat demo.dag
# Filename: demo.dag
#
Job  SEARCH_1   search_1.condor
Job  SEARCH_2   search_2.condor
Job  SEARCH_3   search_3.condor
Job  GREP       grep.condor
Job  SORT       sort.condor
Job  HEAD       head.condor
Parent SEARCH_1 SEARCH_2 SEARCH_3 Child GREP
Parent GREP Child SORT
Parent SORT Child HEAD

---------------------------------------

 

illustrious$ cat search_1.condor
executable = search_1.sh
universe = vanilla
output =  search_1.out
error =   search_1.err
log =     search_1.log

should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT

queue

---------------------------------------

 

illustrious$ cat grep.condor
executable = grep.sh
universe = vanilla
output = grep.out
error =  grep.err
log =    grep.log

should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT
transfer_input_files = active.list,archive.list,scratch.list

queue

---------------------------------------

illustrious$ cat sort.condor
executable = sort.sh
universe = vanilla
output = sort.out
error =  sort.err
log =    sort.log

should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT
transfer_input_files = moragar.list

queue

---------------------------------------

illustrious$ cat head.condor
executable = head.sh
universe = vanilla
output = head.out
error =  head.err
log =    head.log

should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT
transfer_input_files = sorted.list

queue

---------------------------------------

 

Any ideas?

Thanx,

--Russell

