
Re: [HTCondor-users] DAGMAN_ABORT_ON_SCARY_SUBMIT



I just realized that I was submitting the same DAGMan job several times at the same time, so there were several "D1" jobs with the same run script running simultaneously.

Now this leads me to another question: where do I set "DAGMAN_ABORT_ON_SCARY_SUBMIT"? For the job I am working on, in special situations the same DAGMan job may be submitted before the previous one has finished. Is there a way to prevent the second submission from being accepted by Condor in that situation?
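(For reference: DAGMAN_ABORT_ON_SCARY_SUBMIT is a condor_dagman configuration macro, so it would normally go in the HTCondor configuration file read by the machine running the DAG, e.g. condor_config or a file under config.d. A minimal sketch, assuming the default value of True:

```
# Tell condor_dagman not to abort the DAG when the job ID in the
# userlog submit event does not match the ID reported by condor_submit.
# The default is True; only disable this if you are sure the mismatch
# is harmless (see the warning in the error message).
DAGMAN_ABORT_ON_SCARY_SUBMIT = False
```

Note that disabling the check only suppresses the abort; it does not prevent two copies of the same DAG from running at once.)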






On 5/7/2014 2:24 PM, Jiande Wang wrote:
Hi,
I submitted several DAGMan jobs at almost the same time. Each DAG file looks like the following:

Job A ......
Job D1   ..
Job D2 ...
Job D3...
...
Job D24 .....
..
PARENT A CHILD D


The file names of the run scripts for each job are different from each other.
I got this error message in the dagman output file:


ERROR: node D1: job ID in userlog submit event (9025.0.0) doesn't match ID reported earlier by submit command (9021.0.0)! Aborting DAG; set DAGMAN_ABORT_ON_SCARY_SUBMIT to false if you are *sure* this shouldn't cause an abort.


Is this because Condor cannot handle several "D1" jobs at the same time, even though they belong to different DAGs?
Any suggestions on this?

Thanks

Jiande Wang